Abstract: In this paper, we analyze Bayes fusion rule in detail
from a fusion standpoint, as well as the emblematic Dempster’s
rule of combination introduced by Shafer in his Mathematical
Theory of Evidence based on belief functions. We propose a new
formulation of Bayes rule and point out some of its properties.
We then analyze in depth the compatibility of Dempster’s fusion
rule with Bayes fusion rule. We show that Dempster’s rule is
compatible with Bayes fusion rule only in the very particular case
where the basic belief assignments (bba’s) to combine are Bayesian,
and when the prior information is modeled either by a uniform
probability measure, or by a vacuous bba. We show that Dempster’s
rule becomes incompatible with Bayes rule in the more general case
where the prior is truly informative (neither uniform nor vacuous).
Consequently, this paper proves that Dempster’s rule is not a
generalization of Bayes fusion rule.



Keywords — Information fusion, Probability theory, Bayes fusion 
rule, Dempster’s fusion rule. 



I. Introduction

In 1979, Lotfi Zadeh questioned in [1] the validity of
Dempster’s rule of combination [2], [3] proposed by Shafer in
Dempster-Shafer Theory (DST) of evidence [4]. For more
than 30 years, many strong debates [5], [6], [7], [8], [9], [10],
[11], [12], [13], [14], [15] on the validity of the foundations of
DST and Dempster’s rule have taken place. The purpose of this
paper is not to discuss the validity of Dempster’s rule, nor
the foundations of DST, which have already been addressed in
previous papers [16], [17], [18]. In this paper, we focus
on a deep analysis of the incompatibility of Dempster’s
rule with Bayes fusion rule. Our analysis supports Mahler’s
analysis briefly presented in [19].

This paper is organized as follows. In section II, we recall the
basics of conditional probabilities and Bayes fusion rule with
its main properties. In section III, we recall the basics of belief
functions and Dempster’s rule. In section IV, we analyze in
detail the incompatibility of Dempster’s rule with Bayes rule
in general and its partial compatibility for the very particular
case when prior information is modeled by a Bayesian uniform
basic belief assignment (bba). Section V concludes this paper.

II. Conditional probabilities and Bayes fusion

In this section, we recall the definition of conditional
probability [20], [21] and present the principle and the properties of
Bayes fusion rule. We present the structure of this rule derived
from the classical definition of the conditional probability in a
new, uncommon form that will help us to analyze its
partial similarity with Dempster’s rule proposed by Shafer in
his mathematical theory of evidence [4]. We will show
why Dempster’s rule fails to be compatible with Bayes rule in
general.

A. Conditional probabilities

Let us consider two random events $X$ and $Z$. The conditional
probability mass functions (pmfs) $P(X|Z)$ and $P(Z|X)$
are defined 1 (assuming $P(X) > 0$ and $P(Z) > 0$) by [20]:

$$P(X|Z) = \frac{P(X \cap Z)}{P(Z)} \quad \text{and} \quad P(Z|X) = \frac{P(X \cap Z)}{P(X)} \tag{1}$$

From Eq. (1), one gets $P(X \cap Z) = P(X|Z)P(Z) = P(Z|X)P(X)$,
which yields Bayes Theorem:

$$P(X|Z) = \frac{P(Z|X)P(X)}{P(Z)} \quad \text{and} \quad P(Z|X) = \frac{P(X|Z)P(Z)}{P(X)} \tag{2}$$

where $P(X)$ is called the a priori probability of $X$, and
$P(Z|X)$ is called the likelihood of $X$. The denominator $P(Z)$
plays the role of a normalization constant warranting that
$\sum_{i=1}^{N} P(X = x_i|Z) = 1$. In fact $P(Z)$ can be rewritten as

$$P(Z) = \sum_{i=1}^{N} P(Z|X = x_i) P(X = x_i) \tag{3}$$

The set of the $N$ possible exclusive and exhaustive outcomes
of $X$ is denoted $\Theta(X) = \{x_i, i = 1, 2, \ldots, N\}$.

1 For convenience and simplicity, we use the notation $P(X|Z)$ instead of
$P(X = x|Z = z)$, and $P(Z|X)$ instead of $P(Z = z|X = x)$, where $x$ and
$z$ would represent precisely particular outcomes of the random variables $X$
and $Z$.

B. Bayes parallel fusion rule

In fusion applications, we are often interested in computing
the probability of an event $X$ given two events $Z_1$ and $Z_2$
that have occurred. More precisely, one wants to compute
$P(X|Z_1 \cap Z_2)$ knowing $P(X|Z_1)$ and $P(X|Z_2)$, where $X$
can take $N$ distinct exhaustive and exclusive states $x_i$, $i =
1, 2, \ldots, N$. This type of problem is traditionally called a
fusion problem. The computation of $P(X|Z_1 \cap Z_2)$ from
$P(X|Z_1)$ and $P(X|Z_2)$ cannot be done in general without the
knowledge of the probabilities $P(X)$ and $P(X|Z_1 \cup Z_2)$, which
are rarely given. However, $P(X|Z_1 \cap Z_2)$ becomes easily
computable by assuming the following conditional statistical
independence condition expressed mathematically by:

$$(A1): \quad P(Z_1 \cap Z_2|X) = P(Z_1|X) P(Z_2|X) \tag{4}$$

With such conditional independence condition (A1), then from
Eq. (1) and Bayes Theorem one gets:

$$P(X|Z_1 \cap Z_2) = \frac{P(Z_1 \cap Z_2 \cap X)}{P(Z_1 \cap Z_2)} = \frac{P(Z_1 \cap Z_2|X) P(X)}{P(Z_1 \cap Z_2)} = \frac{P(Z_1|X) P(Z_2|X) P(X)}{\sum_{i=1}^{N} P(Z_1|X = x_i) P(Z_2|X = x_i) P(X = x_i)}$$

Using again Eq. (2), we have:

$$P(Z_1|X) = \frac{P(X|Z_1) P(Z_1)}{P(X)} \quad \text{and} \quad P(Z_2|X) = \frac{P(X|Z_2) P(Z_2)}{P(X)}$$

and the previous formula of conditional probability
$P(X|Z_1 \cap Z_2)$ can be rewritten as:

$$P(X|Z_1 \cap Z_2) = \frac{\dfrac{P(X|Z_1) P(X|Z_2)}{P(X)}}{\sum_{i=1}^{N} \dfrac{P(X = x_i|Z_1) P(X = x_i|Z_2)}{P(X = x_i)}} \tag{5}$$

The rule of combination given by Eq. (5) is known as Bayes
parallel (or product) rule and dates back to Bernoulli [22]. In
the classification framework, this formula is also called the
Naive Bayesian Classifier because it uses the assumption (A1),
which is often considered very unrealistic and too simplistic,
and that is why it is called a naive assumption. Eq. (5)
can be rewritten as:

$$P(X|Z_1 \cap Z_2) = \frac{1}{K(X, Z_1, Z_2)} \cdot P(X|Z_1) \cdot P(X|Z_2) \tag{6}$$

where the coefficient $K(X, Z_1, Z_2)$ is defined by:

$$K(X, Z_1, Z_2) \triangleq P(X) \cdot \sum_{i=1}^{N} \frac{P(X = x_i|Z_1) P(X = x_i|Z_2)}{P(X = x_i)} \tag{7}$$
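The two-source rule of Eqs. (5)-(7) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `bayes_fuse2` and the representation of a pmf as a plain list indexed by outcome are our assumptions, not notation from the paper.

```python
def bayes_fuse2(p1, p2, prior):
    """Fuse two posterior pmfs P(X|Z1), P(X|Z2) given the prior P(X), Eq. (5).

    All arguments are lists of length N, indexed by the outcome x_i
    (an assumed representation, chosen for illustration).
    """
    # Unnormalized terms P(x_i|Z1) * P(x_i|Z2) / P(x_i)
    terms = [a * b / p for a, b, p in zip(p1, p2, prior)]
    # The sum of the terms is the denominator of Eq. (5),
    # i.e. K(X, Z1, Z2) / P(X) in the notation of Eq. (7).
    total = sum(terms)
    return [t / total for t in terms]


prior = [0.2, 0.8]
fused = bayes_fuse2([0.1, 0.9], [0.5, 0.5], prior)
print([round(v, 4) for v in fused])  # [0.3077, 0.6923]
```

Normalizing by the sum of the terms, rather than computing $K(X, Z_1, Z_2)$ for each outcome, keeps the sketch short; both give the same result because $P(X)$ cancels in Eq. (6).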



C. Symmetrization of Bayes fusion rule 



The expression of Bayes fusion rule given by Eq. (5)
can also be symmetrized in the following form that, quite
surprisingly, rarely appears in the literature:

$$P(X|Z_1 \cap Z_2) = \frac{\dfrac{P(X|Z_1)}{\sqrt{P(X)}} \cdot \dfrac{P(X|Z_2)}{\sqrt{P(X)}}}{\sum_{i=1}^{N} \dfrac{P(X = x_i|Z_1)}{\sqrt{P(X = x_i)}} \cdot \dfrac{P(X = x_i|Z_2)}{\sqrt{P(X = x_i)}}} \tag{8}$$

or in an equivalent manner:

$$P(X|Z_1 \cap Z_2) = \frac{1}{K'(Z_1, Z_2)} \cdot \frac{P(X|Z_1)}{\sqrt{P(X)}} \cdot \frac{P(X|Z_2)}{\sqrt{P(X)}} \tag{9}$$

where the normalization constant $K'(Z_1, Z_2)$ is given by:

$$K'(Z_1, Z_2) \triangleq \sum_{i=1}^{N} \frac{P(X = x_i|Z_1)}{\sqrt{P(X = x_i)}} \cdot \frac{P(X = x_i|Z_2)}{\sqrt{P(X = x_i)}} \tag{10}$$


We call the quantity

$$A_2(X = x_i) \triangleq \frac{P(X = x_i|Z_1)}{\sqrt{P(X = x_i)}} \cdot \frac{P(X = x_i|Z_2)}{\sqrt{P(X = x_i)}}$$

entering in Eq. (10) the Agreement Factor on $X = x_i$ of order 2,
because only two posterior pmfs are used
in the derivation. $A_2(X = x_i)$ corresponds to the posterior
conjunctive consensus on the event $X = x_i$ taking into account
the prior pmf of $X$. The denominator of Eq. (8) measures
the level of the Global Agreement (GA) of the conjunctive
consensus taking into account the prior pmf of $X$. It is
denoted 2 $GA_2$:

$$GA_2 \triangleq \sum_{\substack{i_1, i_2 = 1 \\ i_1 = i_2}}^{N} \frac{P(X = x_{i_1}|Z_1)}{\sqrt{P(X = x_{i_1})}} \cdot \frac{P(X = x_{i_2}|Z_2)}{\sqrt{P(X = x_{i_2})}} = \sum_{i=1}^{N} \frac{P(X = x_i|Z_1)}{\sqrt{P(X = x_i)}} \cdot \frac{P(X = x_i|Z_2)}{\sqrt{P(X = x_i)}} = K'(Z_1, Z_2) \tag{11}$$

In fact, with assumption (A1), the probability $P(X|Z_1 \cap Z_2)$
given in Eq. (9) is nothing but the simple ratio of the agreement
factor $A_2(X)$ (conjunctive consensus) on $X$ over the global
agreement $GA_2 = \sum_{i=1}^{N} A_2(X = x_i)$, that is:

$$P(X|Z_1 \cap Z_2) = \frac{A_2(X)}{GA_2} \tag{12}$$



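The quantity $GC_2$ given in Eq. (13) measures the global
conflict (i.e. the total conjunctive disagreement) taking into
account the prior pmf of $X$.

$$GC_2 \triangleq \sum_{\substack{i_1, i_2 = 1 \\ i_1 \neq i_2}}^{N} \frac{P(X = x_{i_1}|Z_1)}{\sqrt{P(X = x_{i_1})}} \cdot \frac{P(X = x_{i_2}|Z_2)}{\sqrt{P(X = x_{i_2})}} \tag{13}$$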
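As a hedged illustration of Eqs. (11)-(13), the sketch below builds the full table of products of prior-weighted posteriors, then reads $GA_2$ off the diagonal and $GC_2$ off the remaining entries. The helper names and the sample pmfs are assumptions made for the example.

```python
from math import sqrt


def agreement_table(p1, p2, prior):
    """Entry [i][j] = P(x_i|Z1)/sqrt(P(x_i)) * P(x_j|Z2)/sqrt(P(x_j))."""
    w1 = [a / sqrt(p) for a, p in zip(p1, prior)]
    w2 = [b / sqrt(p) for b, p in zip(p2, prior)]
    return [[u * v for v in w2] for u in w1]


def agreement_levels(p1, p2, prior):
    """Return (GA_2, GC_2, fused pmf) following Eqs. (11), (13) and (12)."""
    table = agreement_table(p1, p2, prior)
    n = len(prior)
    ga = sum(table[i][i] for i in range(n))  # diagonal: i1 == i2
    gc = sum(table[i][j] for i in range(n) for j in range(n) if i != j)
    fused = [table[i][i] / ga for i in range(n)]  # A_2(X = x_i) / GA_2
    return ga, gc, fused


# Illustrative values (assumed for this sketch)
ga, gc, fused = agreement_levels([0.1, 0.9], [0.5, 0.5], [0.2, 0.8])
print(round(ga, 4), round(gc, 4), [round(v, 4) for v in fused])
```

With a non-uniform prior, $GA_2 + GC_2$ is not constrained to equal 1, which the sample values above make visible.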

• Generalization to $P(X|Z_1 \cap Z_2 \cap \ldots \cap Z_s)$

It can be proved that, when assuming conditional independence
conditions, Bayes parallel combination rule can be generalized
for combining $s > 2$ posterior pmfs as:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{1}{K(X, Z_1, \ldots, Z_s)} \cdot \prod_{k=1}^{s} P(X|Z_k) \tag{14}$$

where the coefficient $K(X, Z_1, \ldots, Z_s)$ is defined by:

$$K(X, Z_1, \ldots, Z_s) \triangleq P(X) \cdot \sum_{i=1}^{N} \frac{\prod_{k=1}^{s} P(X = x_i|Z_k)}{P(X = x_i)} \tag{15}$$

The symmetrized form of Eq. (14) is:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{1}{K'(Z_1, \ldots, Z_s)} \cdot \prod_{k=1}^{s} \frac{P(X|Z_k)}{\sqrt[s]{P(X)}} \tag{16}$$

with the normalization constant $K'(Z_1, \ldots, Z_s)$ given by:

$$K'(Z_1, \ldots, Z_s) \triangleq \sum_{i=1}^{N} \prod_{k=1}^{s} \frac{P(X = x_i|Z_k)}{\sqrt[s]{P(X = x_i)}} \tag{17}$$

2 The index 2 is introduced explicitly in the notations because we consider
only the fusion of two posterior pmfs.
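The following sketch implements Eqs. (14)-(15) and the symmetrized Eqs. (16)-(17) exactly as they are written above, and checks numerically that the two forms return the same pmf. The function names and the sample pmfs are hypothetical and chosen only for illustration.

```python
from math import prod


def bayes_fuse(posteriors, prior):
    """Eqs. (14)-(15) as written: normalize prod_k P(x_i|Z_k) / P(x_i)."""
    n = len(prior)
    terms = [prod(p[i] for p in posteriors) / prior[i] for i in range(n)]
    total = sum(terms)
    return [t / total for t in terms]


def bayes_fuse_sym(posteriors, prior):
    """Eqs. (16)-(17): each factor is divided by the s-th root of the prior."""
    s, n = len(posteriors), len(prior)
    agree = [prod(p[i] / prior[i] ** (1.0 / s) for p in posteriors)
             for i in range(n)]
    k_prime = sum(agree)  # K'(Z_1, ..., Z_s) of Eq. (17)
    return [a / k_prime for a in agree], k_prime


# Hypothetical inputs: three sources over three outcomes
prior = [0.5, 0.3, 0.2]
sources = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4]]
direct = bayes_fuse(sources, prior)
symmetric, k_prime = bayes_fuse_sym(sources, prior)
print(all(abs(a - b) < 1e-12 for a, b in zip(direct, symmetric)))
```

The agreement of the two results reflects the fact that the product of the $s$ roots $\sqrt[s]{P(X = x_i)}$ in Eq. (16) is $P(X = x_i)$, the divisor used in Eq. (15).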
The generalization of $A_2(X)$, $GA_2$, and $GC_2$ provides the
agreement $A_s(X)$ of order $s$, the global agreement $GA_s$ and
the global conflict $GC_s$ for $s$ sources as follows:

$$A_s(X = x_i) \triangleq \prod_{k=1}^{s} \frac{P(X = x_i|Z_k)}{\sqrt[s]{P(X = x_i)}}$$

$$GA_s \triangleq \sum_{\substack{i_1, \ldots, i_s = 1 \\ i_1 = \ldots = i_s}}^{N} \frac{P(X = x_{i_1}|Z_1)}{\sqrt[s]{P(X = x_{i_1})}} \cdots \frac{P(X = x_{i_s}|Z_s)}{\sqrt[s]{P(X = x_{i_s})}}$$

$$GC_s \triangleq \sum_{i_1, \ldots, i_s = 1}^{N} \frac{P(X = x_{i_1}|Z_1)}{\sqrt[s]{P(X = x_{i_1})}} \cdots \frac{P(X = x_{i_s}|Z_s)}{\sqrt[s]{P(X = x_{i_s})}} - GA_s$$

• Symbolic representation of Bayes fusion rule

The (symmetrized form of) Bayes fusion rule of two posterior
probability measures $P(X|Z_1)$ and $P(X|Z_2)$, given in Eq. (9),
requires an extra knowledge of the prior probability of $X$. For
convenience, we denote symbolically this fusion rule as:

$$P(X|Z_1 \cap Z_2) = Bayes(P(X|Z_1), P(X|Z_2); P(X)) \tag{18}$$

Similarly, the (symmetrized) Bayes fusion rule of $s > 2$
probability measures $P(X|Z_k)$, $k = 1, 2, \ldots, s$ given by Eq.
(16), which also requires the knowledge of $P(X)$, will be
denoted as:

$$P(X|Z_1 \cap \ldots \cap Z_s) = Bayes(P(X|Z_1), \ldots, P(X|Z_s); P(X))$$

• Particular case: Uniform a priori pmf

If the random variable $X$ is assumed to be a priori uniformly
distributed over the space of its $N$ possible outcomes, then
the probability of $X$ is equal to $P(X = x_i) = 1/N$ for $i =
1, 2, \ldots, N$. In this particular case, all the prior probability
values $\sqrt{P(X = x_i)} = \sqrt{1/N}$
can be simplified in Bayes fusion formulas Eq. (9) and Eq.
(10). Therefore, Bayes fusion formula (9) reduces to:

$$P(X|Z_1 \cap Z_2) = \frac{P(X|Z_1) P(X|Z_2)}{\sum_{i=1}^{N} P(X = x_i|Z_1) P(X = x_i|Z_2)} \tag{19}$$

By convention, Eq. (19) is denoted symbolically as:

$$P(X|Z_1 \cap Z_2) = Bayes(P(X|Z_1), P(X|Z_2)) \tag{20}$$

Similarly, the $Bayes(P(X|Z_1), \ldots, P(X|Z_s))$ rule defined with
a uniform a priori pmf of $X$ will be given by:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{\prod_{k=1}^{s} P(X|Z_k)}{\sum_{i=1}^{N} \prod_{k=1}^{s} P(X = x_i|Z_k)} \tag{21}$$

When $P(X)$ is uniform, one can redefine the global agreement
and the global conflict as:

$$GA_2^{unif} \triangleq \sum_{\substack{i, j = 1 \\ i = j}}^{N} P(X = x_i|Z_1) P(X = x_j|Z_2) \tag{22}$$

$$GC_2^{unif} \triangleq \sum_{\substack{i, j = 1 \\ i \neq j}}^{N} P(X = x_i|Z_1) P(X = x_j|Z_2) \tag{23}$$

Because $\sum_{i=1}^{N} P(X = x_i|Z_1) = 1$ and
$\sum_{j=1}^{N} P(X = x_j|Z_2) = 1$, then

$$\begin{aligned}
1 &= \Big(\sum_{i=1}^{N} P(X = x_i|Z_1)\Big)\Big(\sum_{j=1}^{N} P(X = x_j|Z_2)\Big) \\
  &= \sum_{i, j = 1}^{N} P(X = x_i|Z_1) P(X = x_j|Z_2) \\
  &= \sum_{\substack{i, j = 1 \\ i = j}}^{N} P(X = x_i|Z_1) P(X = x_j|Z_2) + \sum_{\substack{i, j = 1 \\ i \neq j}}^{N} P(X = x_i|Z_1) P(X = x_j|Z_2)
\end{aligned}$$

Therefore, one always has $GA_2^{unif} + GC_2^{unif} = 1$ when $P(X)$
is uniform, and Eq. (19) can be expressed as:

$$P(X|Z_1 \cap Z_2) = \frac{P(X|Z_1) P(X|Z_2)}{GA_2^{unif}} = \frac{P(X|Z_1) P(X|Z_2)}{1 - GC_2^{unif}} \tag{24}$$

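By a direct extension, one will have:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{\prod_{k=1}^{s} P(X|Z_k)}{GA_s^{unif}} = \frac{\prod_{k=1}^{s} P(X|Z_k)}{1 - GC_s^{unif}} \tag{25}$$

where

$$GA_s^{unif} = \sum_{\substack{i_1, \ldots, i_s = 1 \\ i_1 = \ldots = i_s}}^{N} P(X = x_{i_1}|Z_1) \cdots P(X = x_{i_s}|Z_s)$$

$$GC_s^{unif} = 1 - GA_s^{unif}$$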



Remark 1: The normalization coefficient corresponding to the
global conjunctive agreement $GA_s^{unif}$ can also be expressed
using belief function notations [4] as:

$$GA_s^{unif} = \sum_{\substack{x_{i_1}, \ldots, x_{i_s} \in \Theta(X) \\ x_{i_1} \cap \ldots \cap x_{i_s} \neq \emptyset}} P(X = x_{i_1}|Z_1) \cdots P(X = x_{i_s}|Z_s)$$

and the global disagreement, or total conflict level, is given
by:

$$GC_s^{unif} = \sum_{\substack{x_{i_1}, \ldots, x_{i_s} \in \Theta(X) \\ x_{i_1} \cap \ldots \cap x_{i_s} = \emptyset}} P(X = x_{i_1}|Z_1) \cdots P(X = x_{i_s}|Z_s)$$
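The identity $GA_s^{unif} + GC_s^{unif} = 1$ can be checked numerically by enumerating all index tuples, as in the sketch below. The function name `ga_gc_uniform` and the sample pmfs are assumptions made for the illustration.

```python
from itertools import product


def ga_gc_uniform(posteriors):
    """Global agreement and conflict under a uniform prior.

    Enumerates every tuple (i_1, ..., i_s); tuples whose indices are all
    equal contribute to GA (the singletons intersect), all others to GC.
    """
    n = len(posteriors[0])
    ga = gc = 0.0
    for idx in product(range(n), repeat=len(posteriors)):
        mass = 1.0
        for k, i in enumerate(idx):
            mass *= posteriors[k][i]
        if len(set(idx)) == 1:
            ga += mass
        else:
            gc += mass
    return ga, gc


# Illustrative posterior pmfs over three outcomes (assumed values)
ga, gc = ga_gc_uniform([[0.2, 0.3, 0.5], [0.5, 0.1, 0.4]])
print(round(ga, 4), round(gc, 4), round(ga + gc, 4))
```

The enumeration costs $N^s$ products and is meant only to mirror the sums of Remark 1; in practice $GA_s^{unif}$ needs just the $N$ diagonal terms.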



D. Properties of Bayes fusion rule 

In this subsection, we analyze Bayes fusion rule (assuming
condition (A1) holds) from a purely algebraic standpoint. In
fusion jargon, the quantities to combine come from sources
of information which provide inputs that feed the fusion
rule. In the probabilistic framework, a source $s$ to combine
corresponds to the posterior pmf $P(X|Z_s)$. In this subsection,
we establish five interesting properties of Bayes rule. Contrary
to Dempster’s rule, we prove that Bayes rule is not associative
in general.

• (P1) : The pmf $P(X)$ is a neutral element of Bayes fusion
rule when combining only two sources.

Proof: A source is called a neutral element of a fusion
rule if and only if it has no influence on the fusion result.
$P(X)$ is a neutral element of Bayes rule if and only if
$Bayes(P(X|Z_1), P(X); P(X)) = P(X|Z_1)$. It can be easily
verified that this equality holds by replacing $P(X|Z_2)$ by
$P(X)$ and $P(X = x_i|Z_2)$ by $P(X = x_i)$ (as if the
conditioning term $Z_2$ vanishes) in Eq. (5). One can also
verify that $Bayes(P(X), P(X|Z_2); P(X)) = P(X|Z_2)$, which
completes the proof.

Remark 2: When considering Bayes fusion of more than
two sources, $P(X)$ does not play the role of a neutral element
in general, except if $P(X)$ is uniform. For example, let us
consider 3 pmfs $P(X|Z_1)$, $P(X|Z_2)$ and $P(X|Z_3)$ to combine
with formula (14) with $P(X)$ not uniform. When $Z_3$ vanishes
so that $P(X|Z_3) = P(X)$, we can easily check that:

$$Bayes(P(X|Z_1), P(X|Z_2), P(X); P(X)) \neq Bayes(P(X|Z_1), P(X|Z_2); P(X)) \tag{26}$$


• (P2) : Bayes fusion rule is in general not idempotent.

Proof: A fusion rule is idempotent if the combination of
identical inputs is equal to that input. To prove that Bayes rule is
not idempotent it suffices to prove that in general:

$$Bayes(P(X|Z_1), P(X|Z_1); P(X)) \neq P(X|Z_1)$$

From Bayes rule (5), when $P(X|Z_2) = P(X|Z_1)$ we clearly
get in general

$$\frac{\dfrac{1}{P(X)} P(X|Z_1) P(X|Z_1)}{\sum_{i=1}^{N} \dfrac{P(X = x_i|Z_1) P(X = x_i|Z_1)}{P(X = x_i)}} \neq P(X|Z_1) \tag{27}$$

except when $Z_1$ and $Z_2$ vanish, because in such case Eq. (27)
reduces to $P(X)$ on its left and right sides.

Remark 3: In the particular (two sources) degenerate
case where $Z_1$ and $Z_2$ vanish, one always has:
$Bayes(P(X), P(X); P(X)) = P(X)$. However, in
the more general degenerate case (when considering
more than 2 sources), one will have in general:
$Bayes(P(X), P(X), \ldots, P(X); P(X)) \neq P(X)$, except
when $P(X)$ is uniform, or when $P(X)$ is a “deterministic”
probability measure such that $P(X = x_i) = 1$ for a given
$x_i \in \Theta(X)$ and $P(X = x_j) = 0$ for all $x_j \neq x_i$.

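• (P3) : Bayes fusion rule is in general not associative.

Proof: A fusion rule $f$ is called associative if and only if it
satisfies the associative law: $f(f(x, y), z) = f(x, f(y, z)) =
f(y, f(x, z)) = f(x, y, z)$ for all possible inputs $x$, $y$ and $z$.
Let us prove that Bayes rule is not associative from a very
simple example.

Example 1: Let us consider the simplest set of outcomes
$\{x_1, x_2\}$ for $X$, with prior pmf:

$$P(X = x_1) = 0.2 \quad \text{and} \quad P(X = x_2) = 0.8$$

and let us consider the three given sets of posterior pmfs:

$$\begin{cases}
P(X = x_1|Z_1) = 0.1 \ \text{and} \ P(X = x_2|Z_1) = 0.9 \\
P(X = x_1|Z_2) = 0.5 \ \text{and} \ P(X = x_2|Z_2) = 0.5 \\
P(X = x_1|Z_3) = 0.6 \ \text{and} \ P(X = x_2|Z_3) = 0.4
\end{cases}$$

Bayes fusion $Bayes(P(X|Z_1), P(X|Z_2), P(X|Z_3); P(X))$
of the three sources altogether according to Eq. (16) provides:

$$\begin{cases}
P(X = x_1|Z_1 \cap Z_2 \cap Z_3) = \dfrac{1}{K_{123}} \dfrac{0.1}{\sqrt[3]{0.2}} \dfrac{0.5}{\sqrt[3]{0.2}} \dfrac{0.6}{\sqrt[3]{0.2}} = 0.40 \\
P(X = x_2|Z_1 \cap Z_2 \cap Z_3) = \dfrac{1}{K_{123}} \dfrac{0.9}{\sqrt[3]{0.8}} \dfrac{0.5}{\sqrt[3]{0.8}} \dfrac{0.4}{\sqrt[3]{0.8}} = 0.60
\end{cases}$$

where the normalization constant $K_{123}$ is given by:

$$K_{123} = \frac{0.1}{\sqrt[3]{0.2}} \frac{0.5}{\sqrt[3]{0.2}} \frac{0.6}{\sqrt[3]{0.2}} + \frac{0.9}{\sqrt[3]{0.8}} \frac{0.5}{\sqrt[3]{0.8}} \frac{0.4}{\sqrt[3]{0.8}} = 0.3750$$

Let us compute the fusion of $P(X|Z_1)$ with $P(X|Z_2)$ using
$Bayes(P(X|Z_1), P(X|Z_2); P(X))$. One has:

$$\begin{cases}
P(X = x_1|Z_1 \cap Z_2) = \dfrac{1}{K_{12}} \dfrac{0.1}{\sqrt{0.2}} \dfrac{0.5}{\sqrt{0.2}} \approx 0.3077 \\
P(X = x_2|Z_1 \cap Z_2) = \dfrac{1}{K_{12}} \dfrac{0.9}{\sqrt{0.8}} \dfrac{0.5}{\sqrt{0.8}} \approx 0.6923
\end{cases}$$

where the normalization constant $K_{12}$ is given by:

$$K_{12} = \frac{0.1}{\sqrt{0.2}} \frac{0.5}{\sqrt{0.2}} + \frac{0.9}{\sqrt{0.8}} \frac{0.5}{\sqrt{0.8}} = 0.8125$$

Let us compute the fusion of $P(X|Z_2)$ with $P(X|Z_3)$ using
$Bayes(P(X|Z_2), P(X|Z_3); P(X))$. One has:

$$\begin{cases}
P(X = x_1|Z_2 \cap Z_3) = \dfrac{1}{K_{23}} \dfrac{0.5}{\sqrt{0.2}} \dfrac{0.6}{\sqrt{0.2}} \approx 0.8571 \\
P(X = x_2|Z_2 \cap Z_3) = \dfrac{1}{K_{23}} \dfrac{0.5}{\sqrt{0.8}} \dfrac{0.4}{\sqrt{0.8}} \approx 0.1429
\end{cases}$$

where the normalization constant $K_{23}$ is given by:

$$K_{23} = \frac{0.5}{\sqrt{0.2}} \frac{0.6}{\sqrt{0.2}} + \frac{0.5}{\sqrt{0.8}} \frac{0.4}{\sqrt{0.8}} = 1.75$$

Let us compute the fusion of $P(X|Z_1)$ with $P(X|Z_3)$ using
$Bayes(P(X|Z_1), P(X|Z_3); P(X))$. One has:

$$\begin{cases}
P(X = x_1|Z_1 \cap Z_3) = \dfrac{1}{K_{13}} \dfrac{0.1}{\sqrt{0.2}} \dfrac{0.6}{\sqrt{0.2}} = 0.4 \\
P(X = x_2|Z_1 \cap Z_3) = \dfrac{1}{K_{13}} \dfrac{0.9}{\sqrt{0.8}} \dfrac{0.4}{\sqrt{0.8}} = 0.6
\end{cases}$$

where the normalization constant $K_{13}$ is given by:

$$K_{13} = \frac{0.1}{\sqrt{0.2}} \frac{0.6}{\sqrt{0.2}} + \frac{0.9}{\sqrt{0.8}} \frac{0.4}{\sqrt{0.8}} = 0.75$$

Let us compute the fusion of $P(X|Z_1 \cap Z_2)$ with $P(X|Z_3)$
using $Bayes(P(X|Z_1 \cap Z_2), P(X|Z_3); P(X))$. One has:

$$\begin{cases}
P(X = x_1|(Z_1 \cap Z_2) \cap Z_3) = \dfrac{1}{K_{(12)3}} \dfrac{0.3077}{\sqrt{0.2}} \dfrac{0.6}{\sqrt{0.2}} \approx 0.7273 \\
P(X = x_2|(Z_1 \cap Z_2) \cap Z_3) = \dfrac{1}{K_{(12)3}} \dfrac{0.6923}{\sqrt{0.8}} \dfrac{0.4}{\sqrt{0.8}} \approx 0.2727
\end{cases}$$

where the normalization constant $K_{(12)3}$ is given by:

$$K_{(12)3} = \frac{0.3077}{\sqrt{0.2}} \frac{0.6}{\sqrt{0.2}} + \frac{0.6923}{\sqrt{0.8}} \frac{0.4}{\sqrt{0.8}} \approx 1.26925$$

Let us compute the fusion of $P(X|Z_1)$ with $P(X|Z_2 \cap Z_3)$
using $Bayes(P(X|Z_1), P(X|Z_2 \cap Z_3); P(X))$. One has:

$$\begin{cases}
P(X = x_1|Z_1 \cap (Z_2 \cap Z_3)) = \dfrac{1}{K_{1(23)}} \dfrac{0.1}{\sqrt{0.2}} \dfrac{0.8571}{\sqrt{0.2}} \approx 0.7273 \\
P(X = x_2|Z_1 \cap (Z_2 \cap Z_3)) = \dfrac{1}{K_{1(23)}} \dfrac{0.9}{\sqrt{0.8}} \dfrac{0.1429}{\sqrt{0.8}} \approx 0.2727
\end{cases}$$

where the normalization constant $K_{1(23)}$ is given by:

$$K_{1(23)} = \frac{0.1}{\sqrt{0.2}} \frac{0.8571}{\sqrt{0.2}} + \frac{0.9}{\sqrt{0.8}} \frac{0.1429}{\sqrt{0.8}} \approx 0.58931$$

Let us compute the fusion of $P(X|Z_1 \cap Z_3)$ with $P(X|Z_2)$
using $Bayes(P(X|Z_1 \cap Z_3), P(X|Z_2); P(X))$. One has:

$$\begin{cases}
P(X = x_1|(Z_1 \cap Z_3) \cap Z_2) = \dfrac{1}{K_{(13)2}} \dfrac{0.4}{\sqrt{0.2}} \dfrac{0.5}{\sqrt{0.2}} \approx 0.7273 \\
P(X = x_2|(Z_1 \cap Z_3) \cap Z_2) = \dfrac{1}{K_{(13)2}} \dfrac{0.6}{\sqrt{0.8}} \dfrac{0.5}{\sqrt{0.8}} \approx 0.2727
\end{cases}$$

where the normalization constant $K_{(13)2}$ is given by:

$$K_{(13)2} = \frac{0.4}{\sqrt{0.2}} \frac{0.5}{\sqrt{0.2}} + \frac{0.6}{\sqrt{0.8}} \frac{0.5}{\sqrt{0.8}} = 1.375$$

Therefore, one sees that even if in our example one has
$f(x, f(y, z)) = f(f(x, y), z) = f(y, f(x, z))$ because
$P(X|(Z_1 \cap Z_2) \cap Z_3) = P(X|Z_1 \cap (Z_2 \cap Z_3)) =
P(X|Z_2 \cap (Z_1 \cap Z_3))$, Bayes fusion rule is not associative since:

$$\begin{cases}
P(X|(Z_1 \cap Z_2) \cap Z_3) \neq P(X|Z_1 \cap Z_2 \cap Z_3) \\
P(X|Z_1 \cap (Z_2 \cap Z_3)) \neq P(X|Z_1 \cap Z_2 \cap Z_3) \\
P(X|Z_2 \cap (Z_1 \cap Z_3)) \neq P(X|Z_1 \cap Z_2 \cap Z_3)
\end{cases}$$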
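The numbers of Example 1 can be reproduced with a short sketch that applies the rule of Eq. (14), as written in this paper, either to the three sources at once or sequentially. The function name `bayes_fuse` and the variable names are our own choices for the illustration.

```python
from math import prod


def bayes_fuse(posteriors, prior):
    """Eq. (14) as written: normalize prod_k P(x_i|Z_k) / P(x_i)."""
    n = len(prior)
    terms = [prod(p[i] for p in posteriors) / prior[i] for i in range(n)]
    total = sum(terms)
    return [t / total for t in terms]


prior = [0.2, 0.8]
p1, p2, p3 = [0.1, 0.9], [0.5, 0.5], [0.6, 0.4]

# All three sources at once
joint = bayes_fuse([p1, p2, p3], prior)
# Sequential fusions, reusing the same prior at each step
seq_12_3 = bayes_fuse([bayes_fuse([p1, p2], prior), p3], prior)
seq_1_23 = bayes_fuse([p1, bayes_fuse([p2, p3], prior)], prior)
seq_13_2 = bayes_fuse([bayes_fuse([p1, p3], prior), p2], prior)

print([round(v, 4) for v in joint])     # [0.4, 0.6]
print([round(v, 4) for v in seq_12_3])  # [0.7273, 0.2727]
```

The three sequential orders agree with each other but not with the joint result, which is the situation described at the end of Example 1.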



• (P4) : Bayes fusion rule is associative if and only if $P(X)$
is uniform.

Proof: If $P(X)$ is uniform, Bayes fusion rule is given by Eq.
(21) which can be rewritten as:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{P(X|Z_s) \prod_{k=1}^{s-1} P(X|Z_k)}{\sum_{i=1}^{N} P(X = x_i|Z_s) \prod_{k=1}^{s-1} P(X = x_i|Z_k)}$$

By introducing the term $1 / \sum_{i=1}^{N} \prod_{k=1}^{s-1} P(X = x_i|Z_k)$ in
numerator and denominator of the previous formula, it comes:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{\dfrac{\prod_{k=1}^{s-1} P(X|Z_k)}{\sum_{j=1}^{N} \prod_{k=1}^{s-1} P(X = x_j|Z_k)} \, P(X|Z_s)}{\sum_{i=1}^{N} \dfrac{\prod_{k=1}^{s-1} P(X = x_i|Z_k)}{\sum_{j=1}^{N} \prod_{k=1}^{s-1} P(X = x_j|Z_k)} \, P(X = x_i|Z_s)}$$

which can be simply rewritten as:

$$P(X|Z_1 \cap \ldots \cap Z_s) = \frac{P(X|Z_1 \cap \ldots \cap Z_{s-1}) P(X|Z_s)}{\sum_{i=1}^{N} P(X = x_i|Z_1 \cap \ldots \cap Z_{s-1}) P(X = x_i|Z_s)}$$

Therefore when $P(X)$ is uniform, one has:

$$Bayes(P(X|Z_1), \ldots, P(X|Z_s)) = Bayes(Bayes(P(X|Z_1), \ldots, P(X|Z_{s-1})), P(X|Z_s))$$

The previous relation was based on the decomposition of
$\prod_{k=1}^{s} P(X|Z_k)$ as $P(X|Z_s) \prod_{k=1}^{s-1} P(X|Z_k)$. This choice of
decomposition was arbitrary and chosen only for convenience.
In fact $\prod_{k=1}^{s} P(X|Z_k)$ can be decomposed in $s$ different
manners, as $P(X|Z_j) \prod_{k=1 | k \neq j}^{s} P(X|Z_k)$, $j = 1, 2, \ldots, s$, and
a similar analysis can be done. In particular, when $s = 3$,
we will have:

$$\begin{aligned}
Bayes(P(X|Z_1), P(X|Z_2), P(X|Z_3)) &= Bayes(Bayes(P(X|Z_1), P(X|Z_2)), P(X|Z_3)) \\
&= Bayes(P(X|Z_1), Bayes(P(X|Z_2), P(X|Z_3)))
\end{aligned}$$

which completes the proof.

• (P5) : The levels of global agreement and global conflict
between the sources do not matter in Bayes fusion rule.

Proof: This property seems surprising at first glance, but,
since the result of Bayes fusion is nothing but the ratio
of the agreement on $x_i$ ($i = 1, 2, \ldots, N$) over the global
agreement factor, many distinct sources with different global
agreements (and thus with different global conflicts) can yield
the same Bayes fusion result. Indeed, the ratio is kept unchanged
when multiplying its numerator and denominator by the same
nonzero scalar value. Consequently, the absolute levels of global
agreement between the sources (and therefore of global conflict
also) do not matter in Bayes fusion result. What really matters
is only the proportions of relative agreement factors.

Example 2: To illustrate this property, let us consider
Bayes fusion rule applied to two distinct sets 3 of sources
represented by $Bayes(P(X|Z_1), P(X|Z_2); P(X))$ and by
$Bayes(P'(X|Z_1), P'(X|Z_2); P(X))$ with the following prior
and posterior pmfs:

$$P(X = x_1) = 0.2 \quad \text{and} \quad P(X = x_2) = 0.8$$

$$\begin{cases}
P(X = x_1|Z_1) \approx 0.0607 \ \text{and} \ P(X = x_2|Z_1) \approx 0.9393 \\
P(X = x_1|Z_2) \approx 0.6593 \ \text{and} \ P(X = x_2|Z_2) \approx 0.3407
\end{cases}$$

$$\begin{cases}
P'(X = x_1|Z_1) \approx 0.8360 \ \text{and} \ P'(X = x_2|Z_1) \approx 0.1640 \\
P'(X = x_1|Z_2) \approx 0.0240 \ \text{and} \ P'(X = x_2|Z_2) \approx 0.9760
\end{cases}$$

Applying Bayes fusion rule given by Eq. (5), one gets for
$Bayes(P(X|Z_1), P(X|Z_2); P(X))$:

$$\begin{cases}
P(X = x_1|Z_1 \cap Z_2) = \dfrac{0.2}{0.6} = 1/3 \\
P(X = x_2|Z_1 \cap Z_2) = \dfrac{0.4}{0.6} = 2/3
\end{cases} \tag{28}$$

Similarly, one gets for $Bayes(P'(X|Z_1), P'(X|Z_2); P(X))$:

$$\begin{cases}
P'(X = x_1|Z_1 \cap Z_2) = \dfrac{0.1}{0.3} = 1/3 \\
P'(X = x_2|Z_1 \cap Z_2) = \dfrac{0.2}{0.3} = 2/3
\end{cases} \tag{29}$$

Therefore, one sees that $Bayes(P(X|Z_1), P(X|Z_2); P(X)) =
Bayes(P'(X|Z_1), P'(X|Z_2); P(X))$ even if the levels of
global agreements (and global conflicts) are different. In this
particular example, one has:

$$\begin{cases}
(GA_2 = 0.60) \neq (GA'_2 = 0.30) \\
(GC_2 = 1.60) \neq (GC'_2 = 2.05)
\end{cases}$$

In summary, different sets of sources to combine (with different
levels of global agreement and global conflict) can provide
exactly the same result once combined with Bayes fusion
rule. Hence the different levels of global agreement and global
conflict do not really matter in Bayes fusion rule. What really
matters in Bayes fusion rule is only the distribution of all the
relative agreement factors defined as $A_s(X = x_i)/GA_s$.
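Example 2 can be checked with the sketch below, which returns the fused pmf together with $GA_2$ and $GC_2$. Because the posterior values are those of the example rounded to four digits, the fused results match 1/3 and 2/3 only approximately. The function name `fuse_with_levels` is an assumption made for the illustration.

```python
from math import sqrt


def fuse_with_levels(p1, p2, prior):
    """Return (fused pmf, GA_2, GC_2) for two posterior pmfs and a prior."""
    w1 = [a / sqrt(p) for a, p in zip(p1, prior)]
    w2 = [b / sqrt(p) for b, p in zip(p2, prior)]
    agree = [u * v for u, v in zip(w1, w2)]  # A_2(X = x_i)
    ga = sum(agree)                          # terms with i1 == i2
    gc = sum(w1) * sum(w2) - ga              # all pairs minus the i1 == i2 terms
    return [a / ga for a in agree], ga, gc


prior = [0.2, 0.8]
fused_a, ga_a, gc_a = fuse_with_levels([0.0607, 0.9393], [0.6593, 0.3407], prior)
fused_b, ga_b, gc_b = fuse_with_levels([0.8360, 0.1640], [0.0240, 0.9760], prior)

print([round(v, 3) for v in fused_a], round(ga_a, 2), round(gc_a, 2))
print([round(v, 3) for v in fused_b], round(ga_b, 2), round(gc_b, 2))
```

Both source sets lead to nearly the same fused pmf although their global agreement and conflict levels differ by a wide margin.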



3 The values chosen for $P(X|Z_1)$, $P(X|Z_2)$, $P'(X|Z_1)$, $P'(X|Z_2)$ here
have been approximated at the fourth digit. They can be precisely determined
such that the expressions for $P(X|Z_1 \cap Z_2)$ and $P'(X|Z_1 \cap Z_2)$ as given in
Eqs. (28) and (29) hold. For example, the exact value of $P(x_1|Z_2)$ is obtained
by solving a polynomial equation of degree 2 having as a possible solution
$P(x_1|Z_2) = \frac{1}{2}(0.72 + \sqrt{0.72^2 - 4 \times 0.04}) = 0.659332590941915 \approx
0.6593$, etc.

III. Belief functions and Dempster’s rule

Belief Functions (BF) were introduced in 1976 by
Glenn Shafer in his mathematical theory of evidence [4], also
known as Dempster-Shafer Theory (DST), in order to reason
under uncertainty and to model epistemic uncertainties. We
will not present in detail the foundations of DST, but only
the basic mathematical definitions that are necessary for the
scope of this paper. The emblematic fusion rule proposed by
Shafer to combine sources of evidence characterized by their
basic belief assignments (bba) is Dempster’s rule, which will be
analyzed in detail in the sequel. In the literature over the years,
DST has been widely defended by its proponents by arguing
that: 1) Probability measures are particular cases of Belief
functions; and 2) Dempster’s fusion rule is a generalization
of Bayes fusion rule. Although statement 1) is correct
because Probability measures are indeed particular (additive)
Belief functions (called Bayesian belief functions), we will
explain why the second statement about Dempster’s rule is
incorrect in general.



A. Belief functions 

Let $\Theta$ be a frame of discernment of a problem under
consideration. More precisely, the set $\Theta = \{\theta_1, \theta_2, \ldots, \theta_N\}$
consists of a list of $N$ exhaustive and exclusive elements $\theta_i$,
$i = 1, 2, \ldots, N$. Each $\theta_i$ represents a possible state related to
the problem we want to solve. The exhaustivity and exclusivity
of elements of $\Theta$ is referred to as Shafer’s model of the frame
$\Theta$. A basic belief assignment (bba), also called a belief mass
function, $m(.) : 2^\Theta \to [0, 1]$ is a mapping from the power set
of $\Theta$ (i.e. the set of subsets of $\Theta$), denoted $2^\Theta$, to $[0, 1]$, that
verifies the following conditions [4]:

$$m(\emptyset) = 0 \quad \text{and} \quad \sum_{X \in 2^\Theta} m(X) = 1 \tag{31}$$

The quantity $m(X)$ represents the mass of belief exactly
committed to $X$. An element $X \in 2^\Theta$ is called a focal element
if and only if $m(X) > 0$. The set $\mathcal{F}(m) = \{X \in 2^\Theta | m(X) >
0\}$ of all focal elements of a bba $m(.)$ is called the core of
the bba. A bba $m(.)$ is said to be Bayesian if its focal elements
are singletons of $2^\Theta$. The vacuous bba characterizing the total
ignorance denoted 4 $I_t = \theta_1 \cup \theta_2 \cup \ldots \cup \theta_N$ is defined by
$m_v(.) : 2^\Theta \to [0, 1]$ such that $m_v(X) = 0$ if $X \neq \Theta$, and
$m_v(I_t) = 1$.

From any bba $m(.)$, the belief function $Bel(.)$ and the
plausibility function $Pl(.)$ are defined for $\forall X \in 2^\Theta$ as:

$$\begin{cases}
Bel(X) = \sum_{Y \in 2^\Theta | Y \subseteq X} m(Y) \\
Pl(X) = \sum_{Y \in 2^\Theta | X \cap Y \neq \emptyset} m(Y)
\end{cases} \tag{32}$$

$Bel(X)$ represents the whole mass of belief that comes from
all subsets of $\Theta$ included in $X$. It is interpreted as the
lower bound of the probability of $X$, i.e. $P_{min}(X)$. $Bel(.)$
is a subadditive measure since $\sum_{\theta_i \in \Theta} Bel(\theta_i) \leq 1$. $Pl(X)$
represents the whole mass of belief that comes from all
subsets of $\Theta$ compatible with $X$ (i.e., those intersecting $X$).
$Pl(X)$ is interpreted as the upper bound of the probability
of $X$, i.e. $P_{max}(X)$. $Pl(.)$ is a superadditive measure since
$\sum_{\theta_i \in \Theta} Pl(\theta_i) \geq 1$. $Bel(X)$ and $Pl(X)$ are classically seen
[4] as lower and upper bounds of an unknown probability
$P(.)$, and the following inequality is satisfied $\forall X \in 2^\Theta$:
$Bel(X) \leq P(X) \leq Pl(X)$. The belief function $Bel(.)$ (and
the plausibility function $Pl(.)$) built from any Bayesian bba
$m(.)$ can be interpreted as a (subjective) conditional probability
measure provided by a given source of evidence, because if
the bba $m(.)$ is Bayesian the following equality always holds
[4]: $Bel(X) = Pl(X) = P(X)$.
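The definitions of Eq. (32) translate directly into code once a bba is stored as a mapping from focal elements to masses. The sketch below assumes a dictionary keyed by `frozenset`; this representation, the function names and the sample masses are illustrative choices, not part of the paper.

```python
def bel(m, x):
    """Belief of x: total mass of the focal elements included in x."""
    return sum(mass for y, mass in m.items() if y <= x)


def pl(m, x):
    """Plausibility of x: total mass of the focal elements intersecting x."""
    return sum(mass for y, mass in m.items() if y & x)


# Hypothetical bba on the frame {a, b, c}; only focal elements are stored,
# so the empty set (mass 0) never appears as a key.
m = {
    frozenset({"a"}): 0.5,
    frozenset({"a", "b"}): 0.3,
    frozenset({"a", "b", "c"}): 0.2,
}

x = frozenset({"a", "b"})
print(round(bel(m, x), 4), round(pl(m, x), 4))
print(round(bel(m, frozenset({"b"})), 4), round(pl(m, frozenset({"b"})), 4))
```

For the singleton `{"b"}` the belief is 0 while the plausibility is 0.5, which illustrates the interval $[Bel(X), Pl(X)]$ bounding the unknown probability.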



B. Dempster’s rule of combination 

Dempster’s rule of combination, denoted DS rule 5, is a
mathematical operation, represented symbolically by $\oplus$, which
corresponds to the normalized conjunctive fusion rule. Based
on Shafer’s model of $\Theta$, the combination of $s > 1$ independent
and distinct sources of evidence characterized by their bba’s
$m_1(.), \ldots, m_s(.)$ related to the same frame of discernment
$\Theta$ is denoted $m_{DS}(.) = [m_1 \oplus \ldots \oplus m_s](.)$. The quantity
$m_{DS}(.)$ is defined mathematically as follows: $m_{DS}(\emptyset) \triangleq 0$
and $\forall X \neq \emptyset \in 2^\Theta$

$$m_{DS}(X) \triangleq \frac{m_{12 \ldots s}(X)}{1 - K_{12 \ldots s}} \tag{33}$$

where the conjunctive agreement on $X$ is given by:

$$m_{12 \ldots s}(X) \triangleq \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = X}} m_1(X_1) m_2(X_2) \ldots m_s(X_s) \tag{34}$$

and where the global conflict is given by:

$$K_{12 \ldots s} \triangleq \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = \emptyset}} m_1(X_1) m_2(X_2) \ldots m_s(X_s) \tag{35}$$

When $K_{12 \ldots s} = 1$, the $s$ sources are in total conflict and their
combination cannot be computed with DS rule because Eq.
(33) is mathematically not defined due to 0/0 indeterminacy
[4]. DS rule is commutative and associative, which makes it
very attractive from an engineering implementation standpoint.

It has been proved in [4] that the vacuous bba $m_v(.)$
is a neutral element for DS rule because $[m \oplus m_v](.) =
[m_v \oplus m](.) = m(.)$ for any bba $m(.)$ defined on $2^\Theta$. This
property looks reasonable since a totally ignorant source should
not impact the fusion result because it brings no information
that can be helpful for the discrimination between the elements
of the power set $2^\Theta$.
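Eqs. (33)-(35) can be sketched as follows, again assuming that a bba is a dictionary mapping `frozenset` focal elements to masses. The function name `dempster`, the tolerance used to detect total conflict, and the sample bba's are assumptions made for this illustration.

```python
from itertools import product


def dempster(*bbas):
    """Combine bba's with DS rule; return (combined bba, conflict K)."""
    joint = {}
    conflict = 0.0
    # Enumerate one focal element per source, as in Eqs. (34) and (35)
    for combo in product(*(b.items() for b in bbas)):
        inter = frozenset.intersection(*(s for s, _ in combo))
        mass = 1.0
        for _, w in combo:
            mass *= w
        if inter:
            joint[inter] = joint.get(inter, 0.0) + mass  # conjunctive agreement
        else:
            conflict += mass                             # empty intersection
    if abs(1.0 - conflict) < 1e-12:
        raise ValueError("total conflict: Eq. (33) is undefined")
    return {s: w / (1.0 - conflict) for s, w in joint.items()}, conflict


theta = frozenset("abc")
m1 = {frozenset("a"): 0.6, frozenset("ab"): 0.4}
m2 = {frozenset("b"): 0.7, theta: 0.3}
vacuous = {theta: 1.0}

fused, k = dempster(m1, m2)
unchanged, k_vac = dempster(m1, vacuous)
print(round(k, 4), {"".join(sorted(s)): round(w, 4) for s, w in fused.items()})
```

Combining `m1` with the vacuous bba returns `m1` itself with zero conflict, which is the neutrality property recalled above.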



4 The set $\{\theta_1, \theta_2, \ldots, \theta_N\}$ and the complete ignorance $\theta_1 \cup \theta_2 \cup \ldots \cup \theta_N$
are both denoted $\Theta$ in DST.

5 We denote it DS rule because it has been proposed historically by Dempster
[2], [3], and widely promoted by Shafer in the development of DST [4].

IV. Analysis of compatibility of Dempster’s rule
with Bayes rule

To analyze the compatibility of Dempster’s rule with
Bayes rule, we need to work in the probabilistic framework
because Bayes fusion rule has been developed only in this
theoretical framework. So in the sequel, we will manipulate
only probability mass functions (pmfs), related to Bayesian
bba’s in the Belief Function framework. This justifies
the restriction to a singleton bba as a prior bba since we want
to manipulate prior probabilities to make a fair comparison
of the results provided by both rules. If Dempster’s rule is a true
(consistent) generalization of Bayes fusion rule, it must provide
the same results as Bayes rule when combining Bayesian bba’s,
otherwise Dempster’s rule cannot be fairly claimed to be a
generalization of Bayes fusion rule. In this section, we analyze
the real (partial or total) compatibility of Dempster’s rule with
Bayes fusion rule. Two important cases must be analyzed
depending on the nature of the prior information $P(X)$ one
has at hand for performing the fusion of the sources. These
sources to combine will be characterized by the following
Bayesian bba’s:

$$\begin{cases}
m_1(.) \triangleq \{m_1(\theta_i) = P(X = x_i|Z_1), i = 1, 2, \ldots, N\} \\
\quad \vdots \\
m_s(.) \triangleq \{m_s(\theta_i) = P(X = x_i|Z_s), i = 1, 2, \ldots, N\}
\end{cases} \tag{36}$$

The prior information is characterized by a given bba denoted
$m_0(.)$ that can be defined either on $2^\Theta$, or only on $\Theta$ if
we want to deal, for the needs of our analysis, with a Bayesian
prior. In the latter case, if $m_0(.) \triangleq \{m_0(\theta_i) = P(X = x_i), i =
1, 2, \ldots, N\}$ then $m_0(.)$ plays the same role as the prior pmf
$P(X)$ in the probabilistic framework.

When considering a non vacuous prior $m_0(.) \neq m_v(.)$, we
denote Dempster’s combination of $s$ sources symbolically as:

$$m_{DS}(.) = DS(m_1(.), \ldots, m_s(.); m_0(.))$$

When the prior bba is vacuous, $m_0(.) = m_v(.)$, then $m_0(.)$
has no impact on Dempster’s fusion result, and so we denote
symbolically Dempster’s rule as:

$$m_{DS}(.) = DS(m_1(.), \ldots, m_s(.); m_v(.)) = DS(m_1(.), \ldots, m_s(.))$$

A. Case 1: Uniform Bayesian prior 

It is important to note that Dempster’s fusion formula
proposed by Shafer in [4] and recalled in Eq. (33) makes no
real distinction between the nature of the sources to combine
(whether they are posterior or prior information). In fact, the formula
(33) reduces exactly to Bayes rule given in Eq. (25) if the bba’s
to combine are Bayesian and if the prior information is either
uniform or vacuous. Stated otherwise, the following functional
equality holds

$$DS(m_1(.), \ldots, m_s(.); m_0(.)) = Bayes(P(X|Z_1), \ldots, P(X|Z_s); P(X)) \tag{37}$$

as soon as all bba’s $m_i(.)$, $i = 1, 2, \ldots, s$ are Bayesian and
coincide with $P(X|Z_i)$, $P(X)$ is uniform, and either the prior
bba $m_0(.)$ is vacuous ($m_0(.) = m_v(.)$), or $m_0(.)$ is the uniform
Bayesian bba.

Example 3: Let us consider $\Theta(X) = \{x_1, x_2, x_3\}$ with two
distinct sources providing the following Bayesian bba’s

$$\begin{cases}
m_1(x_1) = P(X = x_1|Z_1) = 0.2 \\
m_1(x_2) = P(X = x_2|Z_1) = 0.3 \\
m_1(x_3) = P(X = x_3|Z_1) = 0.5
\end{cases}
\quad \text{and} \quad
\begin{cases}
m_2(x_1) = 0.5 \\
m_2(x_2) = 0.1 \\
m_2(x_3) = 0.4
\end{cases}$$

• If we choose as prior $m_0(.)$ the vacuous bba, that is
$m_0(x_1 \cup x_2 \cup x_3) = 1$, then one will get

$$\begin{cases}
m_{DS}(x_1) = \dfrac{1}{1 - K^{vacuous}} m_1(x_1) m_2(x_1) m_0(x_1 \cup x_2 \cup x_3) = \dfrac{1}{1 - 0.67} \, 0.2 \cdot 0.5 \cdot 1 = \dfrac{0.10}{0.33} \approx 0.3030 \\
m_{DS}(x_2) = \dfrac{1}{1 - K^{vacuous}} m_1(x_2) m_2(x_2) m_0(x_1 \cup x_2 \cup x_3) = \dfrac{1}{1 - 0.67} \, 0.3 \cdot 0.1 \cdot 1 = \dfrac{0.03}{0.33} \approx 0.0909 \\
m_{DS}(x_3) = \dfrac{1}{1 - K^{vacuous}} m_1(x_3) m_2(x_3) m_0(x_1 \cup x_2 \cup x_3) = \dfrac{1}{1 - 0.67} \, 0.5 \cdot 0.4 \cdot 1 = \dfrac{0.20}{0.33} \approx 0.6061
\end{cases}$$

with

$$\begin{aligned}
K^{vacuous} = 1 &- m_1(x_1) m_2(x_1) m_0(x_1 \cup x_2 \cup x_3) \\
&- m_1(x_2) m_2(x_2) m_0(x_1 \cup x_2 \cup x_3) \\
&- m_1(x_3) m_2(x_3) m_0(x_1 \cup x_2 \cup x_3) = 0.67
\end{aligned}$$

• If we choose as prior $m_0(.)$ the uniform Bayesian bba given
by $m_0(x_1) = m_0(x_2) = m_0(x_3) = 1/3$, then we get

$$\begin{cases}
m_{DS}(x_1) = \dfrac{1}{1 - K^{uniform}} m_1(x_1) m_2(x_1) m_0(x_1) = \dfrac{1}{1 - 0.89} \, 0.2 \cdot 0.5 \cdot 1/3 \approx 0.3030 \\
m_{DS}(x_2) = \dfrac{1}{1 - K^{uniform}} m_1(x_2) m_2(x_2) m_0(x_2) = \dfrac{1}{1 - 0.89} \, 0.3 \cdot 0.1 \cdot 1/3 \approx 0.0909 \\
m_{DS}(x_3) = \dfrac{1}{1 - K^{uniform}} m_1(x_3) m_2(x_3) m_0(x_3) = \dfrac{1}{1 - 0.89} \, 0.5 \cdot 0.4 \cdot 1/3 \approx 0.6061
\end{cases}$$

where the degree of conflict when $m_0(.)$ is Bayesian and
uniform is now given by $K^{uniform} = 0.89$.

Clearly $K^{uniform} \neq K^{vacuous}$, but the fusion results obtained with the two distinct priors $m_0(.)$ (vacuous or uniform) are the same because of the algebraic simplification by 1/3 in Dempster’s fusion formula when using the uniform Bayesian bba. When combining Bayesian bba’s $m_1(.)$ and $m_2(.)$, the vacuous prior and the uniform prior $m_0(.)$ therefore have no impact on the result. Indeed, they contain no information that may help to prefer one particular state $x_i$ with respect to the other ones, even if the level of conflict is different in both cases. So, the level of conflict does not matter at all in such a Bayesian case. As already stated, what really matters is only the distribution of relative agreement factors. It can be easily verified that we obtain the same results when applying Bayes Eq. (14), or (16).
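The arithmetic of Example 3 can be checked with a short script. This is only a minimal sketch, assuming Bayesian bba’s stored as lists of singleton masses; the function name `dempster_bayesian` and the convention `prior=None` for the vacuous bba are illustrative choices, not notation from the paper.

```python
# Sketch: Dempster's rule restricted to Bayesian bba's on a common frame,
# combined with a prior that is either vacuous or Bayesian.
# The helper name and the prior=None convention are assumptions made here.

def dempster_bayesian(sources, prior=None):
    """Return (fused singleton masses, degree of conflict K).

    sources: list of Bayesian bba's, each a list of singleton masses.
    prior:   None for the vacuous bba, else a list of singleton masses.
    """
    n = len(sources[0])
    agreement = []
    for j in range(n):
        product = 1.0
        for m in sources:
            product *= m[j]
        # The vacuous prior contributes a factor 1 to every singleton.
        product *= 1.0 if prior is None else prior[j]
        agreement.append(product)
    total = sum(agreement)      # mass of the conjunctive consensus
    conflict = 1.0 - total      # degree of conflict K
    fused = [a / total for a in agreement]
    return fused, conflict


m1 = [0.2, 0.3, 0.5]
m2 = [0.5, 0.1, 0.4]

fused_vacuous, k_vacuous = dempster_bayesian([m1, m2])
fused_uniform, k_uniform = dempster_bayesian([m1, m2], prior=[1 / 3, 1 / 3, 1 / 3])

print([round(v, 4) for v in fused_vacuous], round(k_vacuous, 2))
print([round(v, 4) for v in fused_uniform], round(k_uniform, 2))
```

Both calls return the same fused masses while reporting different degrees of conflict, which is the point made in the example: the common factor 1/3 cancels in the normalization.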

Only in such very particular cases (i.e. Bayesian bba’s, and vacuous or Bayesian uniform priors) is Dempster’s rule fully consistent with Bayes fusion rule. So the claim that Dempster’s rule is a generalization of Bayes rule is true in this very particular case only, and that is why such a claim has been widely used to defend Dempster’s rule and DST. Unfortunately, such compatibility is only partial and not general because it is no longer valid when considering the more general cases involving non uniform Bayesian prior bba’s, as shown in the next subsection.

B. Case 2: Non uniform Bayesian prior 

Let us consider Dempster’s fusion of Bayesian bba’s with a Bayesian non uniform prior $m_0(.)$. In such a case it is easy to check from the general structures of Bayes fusion rule (16) and Dempster’s fusion rule (33) that these two rules are incompatible. Indeed, in Bayes rule one divides each posterior source $m_i(x_j)$ by $\sqrt[s]{m_0(x_j)^{s-1}}$, $i = 1, 2, \ldots, s$, whereas the prior source $m_0(.)$ is combined in a pure conjunctive manner by Dempster’s rule with the bba’s $m_i(.)$, $i = 1, 2, \ldots, s$, as if $m_0(.)$ were a simple additional source. This difference in the processing of prior information between the two approaches clearly explains the incompatibility of Dempster’s rule with Bayes rule when the Bayesian prior bba is not uniform. This incompatibility is illustrated in the next simple example. Mahler and Fixsen have already proposed in [23], [24], [25] a modification of Dempster’s rule to force it to be compatible with Bayes rule when combining Bayesian bba’s. The analysis of such a modified Dempster’s rule is out of the scope of this paper.

Example 4: Let us consider the same frame $\Theta(X)$, and the same bba’s $m_1(.)$ and $m_2(.)$ as in Example 3. Suppose that the prior information is Bayesian and non uniform as follows: $m_0(x_1) = P(X = x_1) = 0.6$, $m_0(x_2) = P(X = x_2) = 0.3$ and $m_0(x_3) = P(X = x_3) = 0.1$. Applying Bayes rule (12) yields:



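$$P(x_1|Z_1 \cap Z_2) = \frac{A_2(x_1)}{GA_2} = \frac{0.2 \cdot 0.5/0.6}{2.2667} = \frac{0.1667}{2.2667} \approx 0.0735$$

$$P(x_2|Z_1 \cap Z_2) = \frac{A_2(x_2)}{GA_2} = \frac{0.3 \cdot 0.1/0.3}{2.2667} = \frac{0.1000}{2.2667} \approx 0.0441$$

$$P(x_3|Z_1 \cap Z_2) = \frac{A_2(x_3)}{GA_2} = \frac{0.5 \cdot 0.4/0.1}{2.2667} = \frac{2.0000}{2.2667} \approx 0.8824$$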



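Applying Dempster’s rule yields $m_{DS}(x_i) \neq P(x_i|Z_1 \cap Z_2)$ because:

$$m_{DS}(x_1) = \frac{1}{1 - 0.9110} \cdot 0.2 \cdot 0.5 \cdot 0.6 = \frac{0.060}{0.089} \approx 0.6742$$

$$m_{DS}(x_2) = \frac{1}{1 - 0.9110} \cdot 0.3 \cdot 0.1 \cdot 0.3 = \frac{0.009}{0.089} \approx 0.1011$$

$$m_{DS}(x_3) = \frac{1}{1 - 0.9110} \cdot 0.5 \cdot 0.4 \cdot 0.1 = \frac{0.020}{0.089} \approx 0.2247$$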



Therefore, one has in general⁶:

$$DS(m_1(.), \ldots, m_s(.); m_0(.)) \neq Bayes(P(X|Z_1), \ldots, P(X|Z_s); P(X)) \qquad (38)$$
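The numbers of Example 4 can be reproduced with the sketch below. It assumes Bayesian bba’s stored as lists of singleton masses, and it uses the standard conditional-independence form of Bayes fusion (product of posteriors divided by the prior raised to the power s-1); the function names are hypothetical and chosen only for this illustration.

```python
# Sketch: Bayes fusion versus Dempster's rule with an informative Bayesian prior.
# Function names are illustrative; bba's are lists of singleton masses.

def bayes_fusion(posteriors, prior):
    """P(x | Z1, ..., Zs) proportional to prod_i P(x | Zi) / P(x)^(s-1)."""
    s = len(posteriors)
    factors = []
    for j, p0 in enumerate(prior):
        product = 1.0
        for p in posteriors:
            product *= p[j]
        factors.append(product / p0 ** (s - 1))
    total = sum(factors)
    return [f / total for f in factors]


def dempster_with_bayesian_prior(sources, prior):
    """Dempster's rule treats the Bayesian prior as one more source."""
    weights = []
    for j, p0 in enumerate(prior):
        product = p0
        for m in sources:
            product *= m[j]
        weights.append(product)
    total = sum(weights)
    return [w / total for w in weights]


m1 = [0.2, 0.3, 0.5]
m2 = [0.5, 0.1, 0.4]
m0 = [0.6, 0.3, 0.1]

bayes_result = bayes_fusion([m1, m2], m0)
ds_result = dempster_with_bayesian_prior([m1, m2], m0)

print([round(v, 4) for v in bayes_result])
print([round(v, 4) for v in ds_result])
```

Bayes fusion divides by the prior while Dempster’s rule multiplies by it, so the two results rank the states in opposite order for this prior.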



V. Conclusions 

In this paper, we have analyzed in detail the expression and the properties of Bayes rule of combination based on the statistical conditional independence assumption, as well as the emblematic Dempster’s rule of combination of belief functions introduced by Shafer in his Mathematical Theory of Evidence. We have clearly explained from a theoretical standpoint, and also on simple examples, why Dempster’s rule is not a generalization of Bayes rule in general. The incompatibility of Dempster’s rule with Bayes rule is due to its inability to deal with non uniform Bayesian priors in the same manner as Bayes rule does. Dempster’s rule turns out to be compatible with Bayes rule only in two very particular cases: 1) if all the Bayesian bba’s to combine (including the prior) focus on the same state (i.e. there is a perfect conjunctive consensus between the sources), or 2) if all the bba’s to combine (excluding the prior) are Bayesian, and if the prior bba cannot help to discriminate a particular state of the frame of discernment (i.e. the prior bba is either vacuous, or Bayesian and uniform). Except in these two very particular cases, Dempster’s rule is totally incompatible with Bayes rule. Therefore, Dempster’s rule cannot be claimed to be a generalization of Bayes fusion rule, even when the bba’s to combine are Bayesian.



Acknowledgment 

This study was co-supported by Grant for State Key Program for Basic Research of China (973) (No. 2013CB329405), National NSF of China (No. 61104214, No. 61203222), and also partly supported by the project AComIn, grant 316087, funded by the FP7 Capacity Programme.



References 

[1] L.A. Zadeh, On the validity of Dempster’s rule of combination, Memo 
M79/24, Univ. of California, Berkeley, CA, U.S.A., 1979. 

[2] A. Dempster, Upper and lower probabilities induced by a multivalued 
mapping, Ann. Math. Statist., Vol. 38, pp. 325-339, 1967. 

[3] A. Dempster, A generalization of bayesian inference, J. R. Stat. Soc. B 
30, pp. 205-247, 1968. 

[4] G. Shafer, A Mathematical theory of evidence, Princeton University 
Press, Princeton, NJ, U.S.A., 1976. 

[5] L.A. Zadeh, Book review: A mathematical theory of evidence, The AI Magazine, Vol. 5, No. 3, pp. 81-83, 1984.

[6] L.A. Zadeh, A simple view of the Dempster-Shafer theory of evidence and its implication for the rule of combination, The AI Magazine, Vol. 7, No. 2, pp. 85-90, 1986.

[7] J. Lemmer, Confidence factors, empiricism and the Dempster-Shafer 
theory of evidence, Proc. of UAI ’85, pp. 160-176, Los Angeles, CA, 
U.S.A., July 10-12, 1985. 

[8] J. Pearl, Do we need higher-order probabilities, and, if so, what do they mean?, Proc. of UAI ’87, pp. 47-60, Seattle, WA, U.S.A., July 10-12, 1987.

[9] F. Voorbraak, On the justification of Dempster’s rule of combination, Dept. of Phil., Utrecht Univ., Logic Group, Preprint Ser., No. 42, Dec. 1988.

[10] G. Shafer, Perspectives on the theory and practice of belief functions, 
IJAR, Vol. 4, No. 5-6, pp. 323-362, 1990. 

[11] J. Pearl, Reasoning with belief functions: an analysis of compatibility, 
IJAR, Vol. 4, No. 5-6, pp. 363-390, 1990. 

[12] J. Pearl, Rejoinder of comments on “Reasoning with belief functions: An analysis of compatibility”, IJAR, Vol. 6, No. 3, pp. 425-443, May 1992.

[13] G.M. Provan, The validity of Dempster-Shafer belief functions, IJAR, 
Vol. 6., No. 3, pp. 389-399, May 1992. 

[14] P. Wang, A defect in Dempster-Shafer theory, Proc. of UAI ’94, pp. 
560-566, Seattle, WA, U.S.A., July 29-31, 1994. 

[15] A. Gelman, The boxer, the wrestler, and the coin flip: a paradox of 
robust bayesian inference and belief functions, American Statistician, 
Vol. 60, No. 2, pp. 146-150, 2006. 

[16] F. Smarandache, J. Dezert (Editors), Applications and advances of 
DSmT for information fusion, Vol. 3, ARP, U.S.A., 2009. 

http://fs.gallup.unm.edu/DSmT.htm

[17] J. Dezert, P. Wang, A. Tchamova, On the validity of Dempster-Shafer 
theory, Proc. of Fusion 2012 Int. Conf., Singapore, July 9-12, 2012. 

[18] A. Tchamova, J. Dezert, On the Behavior of Dempster’s rule of 
combination and the foundations of Dempster-Shafer theory, (best paper 
award), Proc. of 6th IEEE Int. Conf. on Intelligent Systems IS ’12, Sofia, 
Bulgaria, Sept. 6-8, 2012. 

[19] R.P. Mahler, Statistical multisource-multitarget information fusion, 
Chapter 4, Artech House, 2007. 

[20] X.R. Li, Probability, random signals and statistics, CRC Press, 1999. 

[21] P.E. Pfeiffer, Applied probability, Connexions Web site, Aug. 31, 2009.
http://cnx.org/content/col10708/1.6/

[22] G. Shafer, Non-additive probabilities in the work of Bernoulli and Lambert, in Archive for History of Exact Sciences, C. Truesdell (Ed.), Springer-Verlag, Berlin, Vol. 19, No. 4, pp. 309-370, 1978.

[23] R.P. Mahler, Using a priori evidence to customize Dempster-Shafer theory, Proc. of 6th Nat. Symp. on Sensor Fusion, Vol. 1, pp. 331-345, Orlando, FL, U.S.A., April 13-15, 1993.

[24] R.P. Mahler, Classification when a priori evidence is ambiguous, Proc. 
SPIE Conf. on Opt. Eng. in Aerospace Sensing (Automatic Object 
Recognition IV), pp. 296-304, Orlando, FL, U.S.A., April 4-8, 1994. 

[25] D. Fixsen, R.P. Mahler, The modified Dempster-Shafer approach to classification, IEEE Trans. on SMC, Part A, Vol. 27, No. 1, pp. 96-104, 1997.



⁶ Except in the very degenerate case of deterministic Bayesian bba’s, which is of little practical interest from the fusion standpoint.






Advances and Applications of DSmT for Information Fusion. Collected Works. Volume 4 



On the Consistency of PCR6 with the Averaging Rule and 
its Application to Probability Estimation 

Florentin Smarandache 
Jean Dezert 



Originally published as Smarandache F., Dezert J., On the consistency of PCR6 with the averaging rule and its application to probability estimation, Proc. of Fusion 2013 Int. Conference on Information Fusion, Istanbul, Turkey, July 9-12, 2013, and reprinted with permission (with typos corrections).



Abstract — Since the development of belief function theory 
introduced by Shafer in the seventies, many combination rules have been proposed in the literature to combine belief functions, especially (but not only) in highly conflicting situations, because the emblematic Dempster’s rule generates counter-intuitive and unacceptable results in practical applications. Many attempts have been made during the last thirty years to propose better rules of combination based on different frameworks and justifications. Recently in the DSmT (Dezert-Smarandache Theory) framework, two interesting and sophisticated rules (PCR5 and PCR6 rules) have been proposed based on the Proportional Conflict Redistribution (PCR) principle. These two rules coincide for the combination of two basic belief assignments, but they differ in general as soon as three or more sources have to be combined altogether because the PCR principles used in PCR5 and in PCR6 are different. In this paper we show why PCR6 is better than PCR5 to combine three or more sources of evidence and we prove the coherence of PCR6 with the simple Averaging Rule used classically to estimate the probability based on the frequentist interpretation of the probability measure. We show that such a probability estimate cannot be obtained using the Dempster-Shafer (DS) rule, nor the PCR5 rule.

Keywords: Information fusion, belief functions, PCR6, 
PCR5, DSmT, frequentist probability. 

I. Introduction 

In this paper, we work with belief functions [1] defined from the finite and discrete frame of discernment $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$. In the Dempster-Shafer Theory (DST) framework, basic belief assignments (bba’s) provided by the distinct sources of evidence are defined on the fusion space $2^\Theta = (\Theta, \cup)$ consisting of the power-set of $\Theta$, that is the set of elements of $\Theta$ and those generated from $\Theta$ with the union set operator. Such a fusion space assumes that the elements of $\Theta$ are non-empty, exhaustive and exclusive, which is called Shafer’s model of $\Theta$. More generally, in Dezert-Smarandache Theory (DSmT) [2], the fusion space denoted $G^\Theta$ can also be either the hyper-power set $D^\Theta = (\Theta, \cup, \cap)$ (Dedekind’s lattice), or the super-power set¹ $S^\Theta = (\Theta, \cup, \cap, c(.))$ depending on the underlying model of the frame of discernment we choose to fit with the nature of the problem. Details on DSm models are given in [2], Vol. 1.

We assume that $s \geq 2$ basic belief assignments (bba’s) $m_i(.)$, $i = 1, 2, \ldots, s$, provided by s distinct sources of evidence defined on the fusion space $G^\Theta$ are available and we need to combine them for a final decision-making purpose.

¹ $\cap$ and $c(.)$ are respectively the set intersection and complement operators.



For doing this, many rules of combination have been proposed 
in the literature, the most emblematic ones being the simple 
Averaging Rule, Dempster-Shafer (DS) rule, and more recently 
the PCR5 and PCR6 fusion rules. 

The contribution of this paper is to analyze in depth the behavior of the PCR5 and PCR6 fusion rules and to explain why we consider it preferable to use the PCR6 rule rather than the PCR5 rule for combining several distinct sources of evidence altogether. We will show in detail the strong relationship between PCR6 and the averaging fusion rule which is commonly used to estimate the probabilities in the classical frequentist interpretation of probabilities.

This paper is organized as follows. In section II, we briefly recall the background on belief functions and the main fusion rules used in this paper. Section III demonstrates the consistency of the PCR6 fusion rule with the Averaging Rule for binary masses in total conflict as well as the ability of PCR6 to discriminate asymmetric fusion cases for the fusion of Bayesian bba’s. Section IV shows that PCR6 can also be used to estimate empirical probability in a simple (coin tossing) random experiment. Section V concludes and opens a challenging problem about the recursivity of fusion rule formulas, which is sought for efficient implementations.

II. Background on belief functions 
A. Basic belief assignment 

Let’s consider a finite discrete frame of discernment $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$, $n > 1$, of the fusion problem under consideration and its fusion space $G^\Theta$ which can be chosen either as $2^\Theta$, $D^\Theta$ or $S^\Theta$ depending on the model that fits with the problem. A basic belief assignment (bba) associated with a given source of evidence is defined as the mapping $m(.) : G^\Theta \to [0, 1]$ satisfying $m(\emptyset) = 0$ and $\sum_{A \in G^\Theta} m(A) = 1$. The quantity $m(A)$ is called the mass of belief of A committed by the source of evidence. If $m(A) > 0$ then A is called a focal element of the bba $m(.)$. When all focal elements are singletons and $G^\Theta = 2^\Theta$ then $m(.)$ is called a Bayesian bba [1] and it is homogeneous to a (possibly subjective) probability measure. The vacuous bba representing a totally ignorant source is defined as $m_v(\Theta) = 1$. Belief and plausibility functions are defined by

$$Bel(A) = \sum_{\substack{B \subseteq A \\ B \in G^\Theta}} m(B) \quad \text{and} \quad Pl(A) = \sum_{\substack{B \cap A \neq \emptyset \\ B \in G^\Theta}} m(B) \qquad (1)$$

B. Fusion rules

The main information fusion problem in the belief function frameworks (DST or DSmT) is how to combine efficiently several distinct sources of evidence represented by $m_1(.)$, $m_2(.)$, ..., $m_s(.)$ ($s \geq 2$) bba’s defined on $G^\Theta$. Many rules have been proposed for such a task (see [2], Vol. 2, for a detailed list of fusion rules) and we focus here on the following ones: 1) the Averaging Rule because it is the simplest one and it is used to empirically estimate probabilities in random experiments, 2) the DS rule because it was historically proposed in DST, and 3) the PCR5 and PCR6 rules because they were proposed in DSmT and have been shown to provide better results than the DS rule in all applications where they have been tested so far. So we just briefly recall how these rules are mathematically defined.

• Averaging fusion rule $m^{Average}_{1,2,\ldots,s}(.)$

For any X in $G^\Theta$, $m^{Average}_{1,2,\ldots,s}(X)$ is defined by

$$m^{Average}_{1,2,\ldots,s}(X) = Average(m_1, m_2, \ldots, m_s) \triangleq \frac{1}{s} \sum_{i=1}^{s} m_i(X) \qquad (2)$$

Note that the vacuous bba $m_v(\Theta) = 1$ is not a neutral element for this rule. This Averaging Rule is commutative but it is not associative because in general

$$m^{Average}_{1,2,3}(X) = \frac{1}{3}\left[m_1(X) + m_2(X) + m_3(X)\right]$$

is different from

$$m^{Average}_{(1,2),3}(X) = \frac{1}{2}\left[\frac{m_1(X) + m_2(X)}{2} + m_3(X)\right]$$

which is also different from

$$m^{Average}_{1,(2,3)}(X) = \frac{1}{2}\left[m_1(X) + \frac{m_2(X) + m_3(X)}{2}\right]$$

and also from

$$m^{Average}_{(1,3),2}(X) = \frac{1}{2}\left[\frac{m_1(X) + m_3(X)}{2} + m_2(X)\right]$$

In fact, it is easy to prove that the following recursive formula holds

$$m^{Average}_{1,2,\ldots,s}(X) = \frac{s-1}{s} \, m^{Average}_{1,2,\ldots,s-1}(X) + \frac{1}{s} \, m_s(X) \qquad (3)$$



This simple averaging fusion rule has been used for more than two centuries for estimating empirically the probability measure in random experiments [3], [4].
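A small script makes the non-associativity and the recursive update concrete. It is a minimal sketch in which a bba is represented as a dictionary from focal-element labels to masses; the helper names `average_rule` and `recursive_update` and the three binary bba’s are assumptions made only for this illustration.

```python
# Sketch: averaging rule (2) and its recursive form (3).
# A bba is a dict {focal element label: mass}; names are illustrative.

def average_rule(bbas):
    """Mean of the masses committed to each element by the s sources."""
    s = len(bbas)
    elements = set().union(*bbas)
    return {x: sum(m.get(x, 0.0) for m in bbas) / s for x in elements}


def recursive_update(previous_average, new_bba, s):
    """Fold the s-th source into the average of the first s-1 sources."""
    elements = set(previous_average) | set(new_bba)
    return {
        x: (s - 1) / s * previous_average.get(x, 0.0) + new_bba.get(x, 0.0) / s
        for x in elements
    }


m1 = {"A": 1.0}
m2 = {"B": 1.0}
m3 = {"B": 1.0}

all_at_once = average_rule([m1, m2, m3])                 # average of the three sources
nested = average_rule([average_rule([m1, m2]), m3])      # (1,2) first, then 3
recursive = recursive_update(average_rule([m1, m2]), m3, s=3)

print(all_at_once, nested, recursive)
```

Averaging the first two sources and then averaging the result with the third gives the third source a weight of 1/2 instead of 1/3, which is why the nested result differs, while the weighted recursive update recovers the global average.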



• Dempster-Shafer fusion rule $m^{DS}_{1,2,\ldots,s}(.)$

In the DST framework, the fusion space $G^\Theta$ equals the power-set $2^\Theta$ because Shafer’s model of the frame $\Theta$ is assumed. The combination of $s \geq 2$ distinct sources of evidence characterized by the bba’s $m_i(.)$, $i = 1, 2, \ldots, s$, is done with the DS rule as follows [1]: $m^{DS}_{1,2,\ldots,s}(\emptyset) = 0$ and for all $X \neq \emptyset$ in $2^\Theta$

$$m^{DS}_{1,2,\ldots,s}(X) = \frac{1}{K_{1,2,\ldots,s}} \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = X}} \prod_{i=1}^{s} m_i(X_i) \qquad (4)$$

where the numerator of (4) is the mass of belief on the conjunctive consensus on X, and where $K_{1,2,\ldots,s}$ is a normalization constant defined by

$$K_{1,2,\ldots,s} = \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s \neq \emptyset}} \prod_{i=1}^{s} m_i(X_i) = 1 - m_{1,2,\ldots,s}(\emptyset)$$

The total degree of conflict between the s sources of evidence is defined by

$$m_{1,2,\ldots,s}(\emptyset) = \sum_{\substack{X_1, X_2, \ldots, X_s \in 2^\Theta \\ X_1 \cap X_2 \cap \ldots \cap X_s = \emptyset}} \prod_{i=1}^{s} m_i(X_i)$$

The sources are said to be in total conflict when $m_{1,2,\ldots,s}(\emptyset) = 1$.

The vacuous bba $m_v(\Theta) = 1$ is a neutral element for the DS rule, and the DS rule is commutative and associative. It remains the milestone fusion rule of DST. Doubts about the validity of this fusion rule have been discussed by Zadeh since 1979 [5]-[7] based on a very simple example with two highly conflicting sources of evidence. Since the 1980’s, many criticisms have been made about the behavior and justification of the DS rule. More recently, Dezert et al. in [8], [9] have highlighted other counter-intuitive behaviors of the DS rule even in low conflicting cases and showed serious flaws in the logical foundations of DST.
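The DS rule (4) is straightforward to prototype when focal elements are encoded as sets. The sketch below is only an illustration under stated assumptions: bba’s are dictionaries keyed by `frozenset`, the function name `dempster_shafer` is hypothetical, and the two bba’s are the ones used in Example 1 of this paper.

```python
# Sketch: DS rule (4) on the power set, bba's as dicts {frozenset: mass}.
# The encoding and the function name are assumptions made for illustration.
from itertools import product


def dempster_shafer(bbas):
    """Return (normalized conjunctive consensus, total degree of conflict)."""
    conjunctive = {}
    conflict = 0.0
    for combo in product(*(m.items() for m in bbas)):
        intersection = frozenset.intersection(*(focal for focal, _ in combo))
        mass = 1.0
        for _, value in combo:
            mass *= value
        if intersection:
            conjunctive[intersection] = conjunctive.get(intersection, 0.0) + mass
        else:
            conflict += mass
    if not conjunctive:
        raise ValueError("sources are in total conflict; the DS rule is undefined")
    k = 1.0 - conflict  # normalization constant K
    return {x: v / k for x, v in conjunctive.items()}, conflict


A, B = frozenset("A"), frozenset("B")
AB = A | B

m1 = {A: 0.6, B: 0.3, AB: 0.1}
m2 = {A: 0.2, B: 0.3, AB: 0.5}
vacuous = {AB: 1.0}

fused, conflict = dempster_shafer([m1, m2])
fused_with_vacuous, _ = dempster_shafer([m1, m2, vacuous])

print({"".join(sorted(x)): round(v, 4) for x, v in fused.items()}, round(conflict, 2))
```

Adding the vacuous bba as a third source leaves the result unchanged, which illustrates its role as a neutral element of the DS rule.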



• PCR5 and PCR6 fusion rules 



To work in general fusion spaces $G^\Theta$ and to provide better fusion results in all (low or high conflicting) situations, several fusion rules have been developed in the DSmT framework [2]. Among them, two fusion rules called PCR5 and PCR6, based on the proportional conflict redistribution (PCR) principle, have been shown to work efficiently in all the applications where they have been used so far. The PCR principle transfers the conflicting mass only to the elements involved in the conflict and proportionally to their individual masses, so that the specificity of the information is entirely preserved.

The general principle of PCR consists of three steps:

1) apply the conjunctive rule;

2) calculate the total or partial conflicting masses;

3) redistribute the (total or partial) conflicting mass proportionally on non-empty sets according to the integrity constraints one has for the frame $\Theta$.

Because the proportional transfer can be done in two different ways, this has yielded two different fusion rules. The PCR5 fusion rule has been proposed by Smarandache and Dezert in [2], Vol. 2, Chap. 1, and the PCR6 fusion rule has been proposed by Martin and Osswald in [2], Vol. 2, Chap. 2.

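We will not present these two fusion rules in depth since they have already been discussed in detail with many examples in the aforementioned references, but we only give their expressions for convenience here.

The general formula of PCR5 for the combination of $s \geq 2$ sources is given by $m^{PCR5}_{1,2,\ldots,s}(\emptyset) = 0$ and for $X \neq \emptyset$ in $G^\Theta$

$$m^{PCR5}_{1,2,\ldots,s}(X) = m_{1,2,\ldots,s}(X) + \sum_{\substack{2 \leq t \leq s \\ 1 \leq r_1, \ldots, r_t \leq s \\ 1 \leq r_1 < r_2 < \ldots < r_{t-1} < (r_t = s)}} \; \sum_{\substack{X_{j_2}, \ldots, X_{j_t} \in G^\Theta \setminus \{X\} \\ \{j_2, \ldots, j_t\} \in \mathcal{P}^{t-1}(\{1, \ldots, n\}) \\ X \cap X_{j_2} \cap \ldots \cap X_{j_t} = \emptyset \\ \{i_1, \ldots, i_s\} \in \mathcal{P}^{s}(\{1, \ldots, s\})}} \frac{\left(\prod_{k_1=1}^{r_1} m_{i_{k_1}}(X)^2\right) \cdot \left[\prod_{l=2}^{t} \left(\prod_{k_l = r_{l-1}+1}^{r_l} m_{i_{k_l}}(X_{j_l})\right)\right]}{\left(\prod_{k_1=1}^{r_1} m_{i_{k_1}}(X)\right) + \left[\sum_{l=2}^{t} \left(\prod_{k_l = r_{l-1}+1}^{r_l} m_{i_{k_l}}(X_{j_l})\right)\right]} \qquad (5)$$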

The PCR5 and PCR6 fusion rules simplify greatly and coincide for the combination of two sources ($s = 2$). In this simplest case, one always gets the resulting bba $m_{PCR5/6}(.) = m^{PCR6}_{1,2}(.) = m^{PCR5}_{1,2}(.)$ expressed as $m_{PCR5/6}(\emptyset) = 0$ and for all $X \neq \emptyset$ in $G^\Theta$

$$m_{PCR5/6}(X) = \sum_{\substack{X_1, X_2 \in G^\Theta \\ X_1 \cap X_2 = X}} m_1(X_1) m_2(X_2) + \sum_{\substack{Y \in G^\Theta \setminus \{X\} \\ X \cap Y = \emptyset}} \left[ \frac{m_1(X)^2 m_2(Y)}{m_1(X) + m_2(Y)} + \frac{m_2(X)^2 m_1(Y)}{m_2(X) + m_1(Y)} \right] \qquad (8)$$



In the general PCR5 formula (5), i, j, k, r, s and t are integers; $m_{1,2,\ldots,s}(X)$ corresponds to the conjunctive consensus on X between the s sources, and all denominators are different from zero. If a denominator is zero, that fraction is discarded; $\mathcal{P}^k(\{1, 2, \ldots, n\})$ is the set of all subsets of k elements from $\{1, 2, \ldots, n\}$ (selections of k elements out of n in which the order of elements does not count).

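The general formula of PCR6 proposed by Martin and Osswald for the combination of $s \geq 2$ sources is given by $m^{PCR6}_{1,2,\ldots,s}(\emptyset) = 0$ and for $X \neq \emptyset$ in $G^\Theta$

$$m^{PCR6}_{1,2,\ldots,s}(X) = m_{1,2,\ldots,s}(X) + \sum_{i=1}^{s} m_i(X)^2 \sum_{\substack{\bigcap_{k=1}^{s-1} Y_{\sigma_i(k)} \cap X = \emptyset \\ (Y_{\sigma_i(1)}, \ldots, Y_{\sigma_i(s-1)}) \in (G^\Theta)^{s-1}}} \left( \frac{\prod_{j=1}^{s-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)})}{m_i(X) + \sum_{j=1}^{s-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)})} \right) \qquad (6)$$

where $\sigma_i$ counts from 1 to s avoiding i:

$$\sigma_i(j) = \begin{cases} j & \text{if } j < i, \\ j + 1 & \text{if } j \geq i, \end{cases} \qquad (7)$$

Since $Y_i$ is a focal element of expert/source i,

$$m_i(X) + \sum_{j=1}^{s-1} m_{\sigma_i(j)}(Y_{\sigma_i(j)}) \neq 0.$$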

The general PCR5 and PCR6 formulas (5)-(6) are rather complicated and not very easy to understand. From the implementation point of view, PCR6 is much simpler to implement than PCR5. For convenience, very basic (not optimized) Matlab codes of the PCR5 and PCR6 fusion rules can be found in [2], [10] and in the toolboxes repository on the web [11]. The PCR5 and PCR6 fusion rules are commutative but not associative, like the averaging fusion rule, but the vacuous belief assignment is a neutral element for these PCR fusion rules.



In formula (8), all denominators are different from zero. If a denominator is zero, that fraction is discarded. All propositions/sets are in a canonical form.



Example 1: See [2], Vol.2, Chap. 1 for more examples. 



Let’s consider the frame of discernment $\Theta = \{A, B\}$ of exclusive elements. Here Shafer’s model holds so that $G^\Theta = 2^\Theta = \{\emptyset, A, B, A \cup B\}$. We consider two sources of evidence providing the following bba’s

$$m_1(A) = 0.6 \quad m_1(B) = 0.3 \quad m_1(A \cup B) = 0.1$$

$$m_2(A) = 0.2 \quad m_2(B) = 0.3 \quad m_2(A \cup B) = 0.5$$

Then the conjunctive consensus yields:

$$m_{12}(A) = 0.44 \quad m_{12}(B) = 0.27 \quad m_{12}(A \cup B) = 0.05$$

with the conflicting mass

$$m_{12}(A \cap B = \emptyset) = m_1(A) m_2(B) + m_1(B) m_2(A) = 0.18 + 0.06 = 0.24$$



One sees that only A and B are involved in the derivation of the conflicting mass, but not $A \cup B$. With PCR5/6, one redistributes the partial conflicting mass 0.18 to A and B proportionally with the masses $m_1(A)$ and $m_2(B)$ assigned to A and B respectively, and also the partial conflicting mass 0.06 to A and B proportionally with the masses $m_2(A)$ and $m_1(B)$ assigned to A and B respectively, thus one gets two weighting factors of the redistribution for each corresponding set A and B respectively. Let $x_1$ be the conflicting mass to be redistributed to A, and $y_1$ the conflicting mass redistributed to B from the first partial conflicting mass 0.18. This first partial proportional redistribution is then done according to

$$\frac{x_1}{0.6} = \frac{y_1}{0.3} = \frac{x_1 + y_1}{0.6 + 0.3} = \frac{0.18}{0.9} = 0.2$$

whence $x_1 = 0.6 \cdot 0.2 = 0.12$, $y_1 = 0.3 \cdot 0.2 = 0.06$. Now let $x_2$ be the conflicting mass to be redistributed to A, and $y_2$ the conflicting mass redistributed to B from the second partial conflicting mass 0.06. This second partial proportional redistribution is then done according to

$$\frac{x_2}{0.2} = \frac{y_2}{0.3} = \frac{x_2 + y_2}{0.2 + 0.3} = \frac{0.06}{0.5} = 0.12$$

whence $x_2 = 0.2 \cdot 0.12 = 0.024$, $y_2 = 0.3 \cdot 0.12 = 0.036$. Thus one finally gets:

$$m_{PCR5/6}(A) = 0.44 + 0.12 + 0.024 = 0.584$$
$$m_{PCR5/6}(B) = 0.27 + 0.06 + 0.036 = 0.366$$
$$m_{PCR5/6}(A \cup B) = 0.05 + 0 = 0.05$$
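The two-source computation of Example 1 can be reproduced in a few lines. This is a minimal sketch, assuming bba’s stored as dictionaries keyed by `frozenset`; the function name `pcr_two_sources` is hypothetical and the code covers only the case s = 2, where PCR5 and PCR6 coincide.

```python
# Sketch: PCR5/PCR6 for two sources, bba's as dicts {frozenset: mass}.
# The encoding and the function name are assumptions made for illustration.

def pcr_two_sources(m1, m2):
    fused = {}
    for x1, v1 in m1.items():
        for x2, v2 in m2.items():
            intersection = x1 & x2
            if intersection:
                # Conjunctive consensus on a non-empty intersection.
                fused[intersection] = fused.get(intersection, 0.0) + v1 * v2
            elif v1 + v2 > 0:
                # Partial conflict v1 * v2 goes back to x1 and x2,
                # proportionally to the masses v1 and v2 involved in it.
                fused[x1] = fused.get(x1, 0.0) + v1 * v1 * v2 / (v1 + v2)
                fused[x2] = fused.get(x2, 0.0) + v2 * v2 * v1 / (v1 + v2)
    return fused


A, B = frozenset("A"), frozenset("B")
AB = A | B

m1 = {A: 0.6, B: 0.3, AB: 0.1}
m2 = {A: 0.2, B: 0.3, AB: 0.5}

fused = pcr_two_sources(m1, m2)
print({"".join(sorted(x)): round(v, 3) for x, v in fused.items()})
```

Unlike a normalization by the total conflict, the redistribution keeps the mass of $A \cup B$ at its conjunctive value 0.05, because $A \cup B$ is not involved in any conflicting product.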

• The difference between PCR5 and PCR6 fusion rules 



For the two-source case, the PCR5 and PCR6 fusion rules coincide. As soon as three (or more) sources are involved in the fusion process, PCR5 and PCR6 differ in the way the proportional conflict redistribution is done. For example, let’s consider three sources with bba’s $m_1(.)$, $m_2(.)$ and $m_3(.)$, $A \cap B = \emptyset$ for the model of the frame $\Theta$, and $m_1(A) = 0.6$, $m_2(B) = 0.3$, $m_3(B) = 0.1$.

- With PCR5, the mass $m_1(A) m_2(B) m_3(B) = 0.6 \cdot 0.3 \cdot 0.1 = 0.018$ corresponding to a conflict is redistributed back to A and B only with respect to the following proportions respectively: $x_A^{PCR5} = 0.01714$ and $x_B^{PCR5} = 0.00086$ because the proportionalization requires

$$\frac{x_A^{PCR5}}{m_1(A)} = \frac{x_B^{PCR5}}{m_2(B) m_3(B)} = \frac{m_1(A) m_2(B) m_3(B)}{m_1(A) + m_2(B) m_3(B)}$$

that is

$$\frac{x_A^{PCR5}}{0.6} = \frac{x_B^{PCR5}}{0.03} = \frac{0.018}{0.6 + 0.03} \approx 0.02857$$

Thus

$$x_A^{PCR5} = 0.60 \cdot 0.02857 \approx 0.01714$$
$$x_B^{PCR5} = 0.03 \cdot 0.02857 \approx 0.00086$$



- With the PCR6 fusion rule, the partial conflicting mass $m_1(A) m_2(B) m_3(B) = 0.6 \cdot 0.3 \cdot 0.1 = 0.018$ is redistributed back to A and B only with respect to the following proportions respectively: $x_A^{PCR6} = 0.0108$ and $x_B^{PCR6} = 0.0072$ because the PCR6 proportionalization is done as follows:

$$\frac{x_A^{PCR6}}{m_1(A)} = \frac{x_B^{PCR6}}{m_2(B) + m_3(B)} = \frac{m_1(A) m_2(B) m_3(B)}{m_1(A) + (m_2(B) + m_3(B))}$$

that is

$$\frac{x_A^{PCR6}}{0.6} = \frac{x_B^{PCR6}}{0.3 + 0.1} = \frac{0.018}{0.6 + (0.3 + 0.1)} = 0.018$$

and therefore with PCR6, one finally gets the following redistributions to A and B:

$$x_A^{PCR6} = 0.6 \cdot 0.018 = 0.0108$$
$$x_B^{PCR6} = (0.3 + 0.1) \cdot 0.018 = 0.0072$$
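The only difference between the two redistributions above is how the masses committed to the same element are aggregated into a weight: by product for PCR5, by sum for PCR6. The sketch below isolates that step for a single conflicting product; the function name `split_conflict` and its input format are assumptions made for this illustration, and it is not a full implementation of formulas (5) or (6).

```python
# Sketch: redistribution of ONE conflicting product among the elements
# involved in it. PCR5 weights an element by the product of the masses
# committed to it, PCR6 by their sum. Names are illustrative.
from math import prod


def split_conflict(masses_by_element, rule):
    """masses_by_element: {element: [masses committed to it in this product]}."""
    conflict = prod(v for masses in masses_by_element.values() for v in masses)
    combine = prod if rule == "PCR5" else sum
    weights = {x: combine(masses) for x, masses in masses_by_element.items()}
    total = sum(weights.values())
    return {x: w * conflict / total for x, w in weights.items()}


# m1(A) = 0.6, m2(B) = 0.3, m3(B) = 0.1, with A and B disjoint.
involved = {"A": [0.6], "B": [0.3, 0.1]}

pcr5_shares = split_conflict(involved, "PCR5")
pcr6_shares = split_conflict(involved, "PCR6")

print({x: round(v, 5) for x, v in pcr5_shares.items()})
print({x: round(v, 5) for x, v in pcr6_shares.items()})
```

With the product-based weight, the two sources supporting B end up with a much smaller share than either of them would get individually, whereas the sum-based weight of PCR6 lets each source recover a share proportional to its own mass.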



In [2], Vol. 2, Chap. 2, Martin and Osswald have proposed PCR6 based on intuitive considerations and the authors have shown through simulations that PCR6 is more stable than PCR5 in terms of decision for combining $s > 2$ sources of evidence. Based on these results and the relative “simplicity” of implementation of PCR6 over PCR5, PCR6 has been considered more interesting/efficient than PCR5 for combining 3 (or more) sources of evidence.



III. Consistency of PCR6 with the Averaging Rule 

In this section we show why we also consider PCR6 
as better than PCR5 for combining bba’s. But here, our 
argumentation is not based on particular simulation results 
and decision-making as done by Martin and Osswald, but on 
a theoretical analysis of the structure of PCR6 fusion rule 
itself. In particular, we show the full consistency of PCR6 rule 
with the averaging fusion rule used to empirically estimate 
probabilities in random experiments. For doing this, it is 
necessary to simplify the original PCR6 fusion formula (6). 
Such simplification has already been proposed in [12] and the 
PCR6 fusion rule can be in fact rewritten as 

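$$m^{PCR6}_{1,2,\ldots,s}(X) = m_{1,2,\ldots,s}(X) + \sum_{k=1}^{s-1} \; \sum_{\substack{X_{i_1}, X_{i_2}, \ldots, X_{i_k} \in G^\Theta \setminus X \\ \left(\bigcap_{j=1}^{k} X_{i_j}\right) \cap X = \emptyset}} \; \sum_{(i_1, i_2, \ldots, i_s) \in \mathcal{P}^s(\{1, \ldots, s\})} \left[m_{i_1}(X) + m_{i_2}(X) + \ldots + m_{i_k}(X)\right] \cdot \frac{m_{i_1}(X) \cdots m_{i_k}(X) \, m_{i_{k+1}}(X_{i_{k+1}}) \cdots m_{i_s}(X_{i_s})}{m_{i_1}(X) + \ldots + m_{i_k}(X) + m_{i_{k+1}}(X_{i_{k+1}}) + \ldots + m_{i_s}(X_{i_s})} \qquad (9)$$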

where $\mathcal{P}^s(\{1, \ldots, s\})$ is the set of all permutations of the elements $\{1, 2, \ldots, s\}$. It should be observed that the sets $X_{i_1}$, $X_{i_2}$, ... involved in the sums may be different from each other, or some of them equal and others different, etc.

We wrote this PCR6 general formula (9) in the style of PCR5, different from Arnaud Martin & Christophe Osswald’s notations, but actually doing the same thing. In order not to complicate the formula of PCR6, we did not use more summations or products after the third Sigma.



We are now able to establish the consistency of the general PCR6 formula with the Averaging fusion rule for the case of binary bba’s through the following theorem.

Theorem 1: When $s \geq 2$ sources of evidence provide binary bba’s on $G^\Theta$ whose total conflicting mass is 1, then the PCR6 fusion rule coincides with the averaging fusion rule. Otherwise, PCR6 and the averaging fusion rule provide in general different results.

Proof 1: All $s \geq 2$ bba’s are assumed binary, i.e. $m(X) = 0$ or 1 (only the two numerical values 0 and 1 are allowed) for any bba $m(.)$ and for any set X among the focal elements. A focal element in this case is an element X such that at least one of the s binary sources assigns a mass equal to 1 to X. Let’s suppose the focal elements are $F_1, F_2, \ldots, F_n$. Then the set of bba’s to combine can be expressed as in Table I, where



Table I. LIST OF BBA’S TO COMBINE.

bba’s \ Focal elem.    F_1    F_2    ...    F_n
m_1(.)                 ★      ★      ...    ★
m_2(.)                 ★      ★      ...    ★
...                    ...    ...    ...    ...
m_s(.)                 ★      ★      ...    ★

• all ★ are 0’s or 1’s;

• on each row there is only one 1 (since the sum of all masses of a bba is equal to 1) and all the other elements are 0’s;

• also each column has at least one 1 (since all elements are focal; if there were a column, corresponding for example to the set $F_p$, having only 0’s, then it would result that the set $F_p$ is not focal, i.e. that all $m(F_p) = 0$).



Using PCR6, we first need to apply the conjunctive rule to all s sources, and the result is a sum of products of the form $m_1(X_1) m_2(X_2) \cdots m_s(X_s)$ where $X_1, X_2, \ldots, X_s$ are the focal elements $F_1, F_2, \ldots, F_n$ in various permutations, with $s \geq n$. If $s > n$ some focal elements $X_i$ are repeated in the product $m_1(X_1) m_2(X_2) \cdots m_s(X_s)$. But there is only one product of the form $m_1(X_1) m_2(X_2) \cdots m_s(X_s) = 1$ which is not equal to zero, i.e. the product in which each factor equals 1 (the product that collects from each row the existing single 1). Since the total conflicting mass is equal to 1, it means that this product represents the total conflict. In this case the PCR6 formula (9) becomes



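$$m^{PCR6}_{1,2,\ldots,s}(X) = 0 + \sum_{k=1}^{s-1} \; \sum_{\substack{X_{i_1}, \ldots, X_{i_k} \in G^\Theta \setminus X \\ \left(\bigcap_{j=1}^{k} X_{i_j}\right) \cap X = \emptyset}} \; \sum_{(i_1, i_2, \ldots, i_s) \in \mathcal{P}^s(\{1, \ldots, s\})} [1 + 1 + \ldots + 1] \cdot \frac{1 \cdot 1 \cdots 1}{1 + 1 + \ldots + 1 + 1 + \ldots + 1} \qquad (10)$$

The previous expression can be rewritten as

$$m^{PCR6}_{1,2,\ldots,s}(X) = \sum_{\substack{X_{i_1}, X_{i_2}, \ldots, X_{i_k} \in G^\Theta \setminus X \\ \left(\bigcap_{j=1}^{k} X_{i_j}\right) \cap X = \emptyset}} \; \sum_{(i_1, i_2, \ldots, i_s) \in \mathcal{P}^s(\{1, \ldots, s\})} k \cdot \frac{1}{s}$$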

which is equal to $k/s$ since there is only one possible non-null product of the form $m_1(X_1) m_2(X_2) \cdots m_s(X_s)$, and all other products are equal to zero. Therefore, we finally get:

$$m^{PCR6}_{1,2,\ldots,s}(X) = \frac{k}{s} \qquad (11)$$

where k is the number of bba’s $m(.)$ which give $m(X) = 1$. Therefore PCR6 in this case reduces to the average of masses, which completes Proof 1 of the theorem.

Proof 2: A second method of proving this theorem is as follows. Let $m_1(.)$, $m_2(.)$, ..., $m_s(.)$, for $s \geq 3$, be the bba’s of the sources of information to combine and denote $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$, for $n \geq 2$, the set of all focal elements. All sources give only binary masses, i.e. $m_k(F_l) = 0$ or $m_k(F_l) = 1$ for any $k \in \{1, 2, \ldots, s\}$ and any $l \in \{1, 2, \ldots, n\}$. Since each $F_l$, $1 \leq l \leq n$, is a focal element, there exists at least one bba $m_{l_0}(.)$ such that $m_{l_0}(F_l) = 1$, otherwise (i.e. if all sources gave a mass equal to zero to $F_l$) $F_l$ would not be focal. Without loss of generality, we can regroup the masses (since we combine all of them at once, their order does not matter), as in Table II. Of course $i_1 + i_2 + \ldots + i_n = s$, since the s bba’s are the same but reordered, and $i_1 \geq 1$, $i_2 \geq 1$, ..., and $i_n \geq 1$. The total conflicting mass $m_{1,2,\ldots,s}(\emptyset)$ is 1 according to the theorem hypothesis. With the PCR6 fusion rule we transfer the conflict mass back to the focal elements $F_1, F_2, \ldots, F_n$ respectively according to the PCR



Table II. LIST OF REORDERED BINARY BBA’S.

bba’s \ Focal elem.    F_1    F_2    ...    F_n    ∅
m_{r_1}(.)             1      0      ...    0      0
m_{r_2}(.)             1      0      ...    0      0
m_{r_{i_1}}(.)         1      0      ...    0      0
m_{s_1}(.)             0      1      ...    0      0
m_{s_2}(.)             0      1      ...    0      0
m_{s_{i_2}}(.)         0      1      ...    0      0
...                    ...    ...    ...    ...    ...
m_{u_1}(.)             0      0      ...    1      0
m_{u_2}(.)             0      0      ...    1      0
m_{u_{i_n}}(.)         0      0      ...    1      0
m_{1,2,...,s}(.)       0      0      ...    0      1



principle such that: 



x F 1 Xf 2 

1 + 1 + .. . + 1 = 1 + 1 + .. . + 1 = “■ 

S. v ✓ S. v ✓ 

i\ times 22 times 

_ x F n _ mi, 2 ,...,s(0) _ 1 

1 + 1 + .. . + 1 Zi+i2 + ***+^n s 

i n times 



whence $x_{F_1} = i_1/s$, $x_{F_2} = i_2/s$, ..., $x_{F_n} = i_n/s$.
Therefore $m^{PCR6}_{1,2,\ldots,s}(F_1) = i_1/s$,
$m^{PCR6}_{1,2,\ldots,s}(F_2) = i_2/s$, ...,
$m^{PCR6}_{1,2,\ldots,s}(F_n) = i_n/s$. But averaging the masses
$m_1(.)$, $m_2(.)$, ..., $m_s(.)$ is equivalent to averaging each
of the columns $F_1$, $F_2$, ..., $F_n$. Hence the average of
column $F_1$ is $i_1/s$, the average of column $F_2$ is $i_2/s$,
..., and the average of column $F_n$ is $i_n/s$.

Therefore, in the case of binary bba’s which are globally totally
conflicting, the PCR6 rule is equal to the Averaging Rule. This
completes Proof 2 of the theorem.

Note that using the PCR5 fusion rule, we also transfer the
total conflicting mass, which is equal to 1, to $F_1$, $F_2$, ...,
$F_n$ respectively, but we replace the addition "+" with the
multiplication "·" in the above proportionalizations:

$$\frac{x_{F_1}}{\underbrace{1 \cdot 1 \cdot \ldots \cdot 1}_{i_1 \text{ times}}} = \frac{x_{F_2}}{\underbrace{1 \cdot 1 \cdot \ldots \cdot 1}_{i_2 \text{ times}}} = \ldots = \frac{x_{F_n}}{\underbrace{1 \cdot 1 \cdot \ldots \cdot 1}_{i_n \text{ times}}} = \frac{m_{1,2,\ldots,s}(\emptyset)}{\underbrace{1+1+\ldots+1}_{n \text{ times}}} = \frac{1}{n}$$

so that $x_{F_1} = 1/n$, $x_{F_2} = 1/n$, ..., $x_{F_n} = 1/n$ and therefore

$$m^{PCR5}_{1,2,\ldots,s}(F_1) = m^{PCR5}_{1,2,\ldots,s}(F_2) = \ldots = m^{PCR5}_{1,2,\ldots,s}(F_n) = 1/n$$
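As a small illustration, assume a hypothetical configuration (chosen here only for the sake of the example) of $s = 3$ binary sources, two of which commit their whole mass to $F_1$ and one to $F_2$, with $F_1 \cap F_2 = \emptyset$, so that $i_1 = 2$, $i_2 = 1$ and $n = 2$. Applying the two proportionalizations as stated above would give:

$$\begin{aligned}
\text{PCR6:}\quad & \frac{x_{F_1}}{1+1} = \frac{x_{F_2}}{1} = \frac{1}{3} \;\Rightarrow\; x_{F_1} = \frac{2}{3},\; x_{F_2} = \frac{1}{3} \\
\text{PCR5:}\quad & \frac{x_{F_1}}{1 \cdot 1} = \frac{x_{F_2}}{1} = \frac{1}{1+1} \;\Rightarrow\; x_{F_1} = \frac{1}{2},\; x_{F_2} = \frac{1}{2}
\end{aligned}$$

In this configuration PCR6 reproduces the column averages $(2/3, 1/3)$, whereas PCR5 yields the uniform result $(1/2, 1/2)$ regardless of how many sources support each focal element.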



Corollary 1: When $s \geq 2$ sources of evidence provide binary
bba’s on $G^\Theta$ with at least two focal elements, and all focal
elements are pairwise disjoint, then the PCR6 fusion rule
coincides with the Averaging Rule.

This corollary is true because if all focal elements are
pairwise disjoint then the total conflict is equal to 1.



Example 2: where the PCR6 rule equals the Averaging Rule.

Let’s consider the frame $\Theta = \{A, B\}$ with Shafer’s model
and the bba’s to combine as given in Table III.

Table III. LIST OF BBA’S TO COMBINE FOR EXAMPLE 2.

| bba’s \ Focal elem. | $A$ | $B$ | $A \cup B$ | $A \cap B = \emptyset$ |
|---|---|---|---|---|
| $m_1(.)$ | 1 | 0 | 0 | |
| $m_2(.)$ | 0 | 1 | 0 | |
| $m_3(.)$ | 0 | 0 | 1 | |
| $m_{1,2,3}(.)$ | 0 | 0 | 0 | 1 |

Since we have binary masses, and their total conflict is 1, we
expect to get the same result for PCR6 and the Averaging
Rule according to Theorem 1. The PCR principle gives us

$$\frac{x_A}{1} = \frac{y_B}{1} = \frac{z_{A \cup B}}{1} = \frac{m_{1,2,3}(\emptyset)}{1+1+1} = \frac{1}{3}$$

Hence $x_A = y_B = z_{A \cup B} = \frac{1}{3}$, so that

$$m^{PCR6}_{1,2,3}(A) = m_{1,2,3}(A) + x_A = 0 + \frac{1}{3} = \frac{1}{3}$$

$$m^{PCR6}_{1,2,3}(B) = m_{1,2,3}(B) + y_B = 0 + \frac{1}{3} = \frac{1}{3}$$

$$m^{PCR6}_{1,2,3}(A \cup B) = m_{1,2,3}(A \cup B) + z_{A \cup B} = 0 + \frac{1}{3} = \frac{1}{3}$$

Interestingly, PCR5 gives the same result as PCR6 in this case
since one makes the same proportionalizations as for PCR6.
Using the Averaging Rule (2), we get

$$m^{Average}_{1,2,3}(A) = \frac{1}{3} \cdot (1 + 0 + 0) = \frac{1}{3}$$

$$m^{Average}_{1,2,3}(B) = \frac{1}{3} \cdot (0 + 1 + 0) = \frac{1}{3}$$

$$m^{Average}_{1,2,3}(A \cup B) = \frac{1}{3} \cdot (0 + 0 + 1) = \frac{1}{3}$$

So we see that the PCR6 rule equals the Averaging Rule,
as proved in the theorem, because the bba’s are binary
and the intersection of all focal elements is empty:
$A \cap B \cap (A \cup B) = \emptyset \cap (A \cup B) = \emptyset$
because $A \cap B = \emptyset$, since Shafer’s model has been
assumed for the frame $\Theta$.

Example 3: where PCR6 differs from the Averaging Rule.

Let’s consider the frame $\Theta = \{A, B, C\}$ with Shafer’s
model and the bba’s to combine as given in Table IV.

Table IV. LIST OF BBA’S TO COMBINE FOR EXAMPLE 3.

| bba’s \ Focal elem. | $A$ | $A \cup B$ | $A \cup B \cup C$ | $\emptyset$ |
|---|---|---|---|---|
| $m_1(.)$ | 1 | 0 | 0 | |
| $m_2(.)$ | 0 | 1 | 0 | |
| $m_3(.)$ | 0 | 0 | 1 | |
| $m_{1,2,3}(.)$ | 1 | 0 | 0 | |


Clearly, in this case the focal elements are nested and the
condition on emptiness of the intersection of all focal elements
is not satisfied because one has
$A \cap (A \cup B) \cap (A \cup B \cup C) = A \neq \emptyset$,
so that the theorem cannot be applied in this case. The
total conflicting mass is not 1. One can verify in this example
that the PCR6 rule differs from the Averaging Rule because one
gets

$$m^{PCR6}_{1,2,3}(A) = m_{1,2,3}(A) = 1$$

$$m^{PCR6}_{1,2,3}(A \cup B) = m_{1,2,3}(A \cup B) = 0$$

$$m^{PCR6}_{1,2,3}(A \cup B \cup C) = m_{1,2,3}(A \cup B \cup C) = 0$$

since there is no conflicting mass to redistribute with the PCR
principle, whereas the averaging fusion rule gives

$$m^{Average}_{1,2,3}(A) = \frac{1}{3} \cdot (1 + 0 + 0) = \frac{1}{3}$$

$$m^{Average}_{1,2,3}(A \cup B) = \frac{1}{3} \cdot (0 + 1 + 0) = \frac{1}{3}$$

$$m^{Average}_{1,2,3}(A \cup B \cup C) = \frac{1}{3} \cdot (0 + 0 + 1) = \frac{1}{3}$$
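Examples 2 and 3 can be replayed numerically. The sketch below is an illustration only and not the authors’ implementation: it assumes Shafer’s model, encodes each focal element as a Python `frozenset` of singleton labels (so `frozenset("AB")` stands for $A \cup B$ and set intersection plays the role of $\cap$), and the helper names `pcr6` and `averaging` are hypothetical. Each conflicting product is sent back to the elements involved in it, proportionally to the masses the sources committed to them, as in the proportionalizations used in the examples.

```python
from fractions import Fraction
from itertools import product
from math import prod


def pcr6(bbas):
    """PCR6 sketch; each bba is a dict {frozenset focal element: mass}."""
    fused = {}
    # Enumerate every product m_1(X_1) * m_2(X_2) * ... * m_s(X_s).
    for combo in product(*(bba.items() for bba in bbas)):
        focal_sets = [focal for focal, _ in combo]
        masses = [mass for _, mass in combo]
        p = prod(masses)
        if p == 0:
            continue
        common = frozenset.intersection(*focal_sets)
        if common:
            # Non-conflicting product: goes to the intersection.
            fused[common] = fused.get(common, 0) + p
        else:
            # Conflicting product: redistributed to each source's element,
            # proportionally to the mass that source committed to it.
            total = sum(masses)
            for focal, mass in combo:
                fused[focal] = fused.get(focal, 0) + mass * p / total
    return fused


def averaging(bbas):
    """Averaging rule: arithmetic mean of the masses of each focal element."""
    fused = {}
    for bba in bbas:
        for focal, mass in bba.items():
            fused[focal] = fused.get(focal, 0) + mass / len(bbas)
    return fused


A, B = frozenset("A"), frozenset("B")
AB, ABC = frozenset("AB"), frozenset("ABC")
one = Fraction(1)  # exact arithmetic, an implementation choice of this sketch

example_2 = [{A: one}, {B: one}, {AB: one}]    # Table III
example_3 = [{A: one}, {AB: one}, {ABC: one}]  # Table IV

print(pcr6(example_2) == averaging(example_2))  # True
print(pcr6(example_3) == averaging(example_3))  # False
```

With these inputs the first comparison holds (binary bba’s whose total conflict is 1) and the second does not (nested focal elements, no conflict to redistribute), in agreement with the two examples.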

Example 4 (Bayesian non-binary bba’s): where PCR6
differs from the Averaging Rule.

Let’s consider the frame $\Theta = \{A, B\}$ with Shafer’s model
and the Bayesian bba’s to combine as given in Table V.

Table V. LIST OF BBA’S TO COMBINE FOR EXAMPLE 4.

| bba’s \ Focal elem. | $A$ | $B$ | $A \cap B = \emptyset$ |
|---|---|---|---|
| $m_1(.)$ | 0.2 | 0.8 | 0 |
| $m_2(.)$ | 0.6 | 0.4 | 0 |
| $m_3(.)$ | 0.7 | 0.3 | 0 |
| $m_{1,2,3}(.)$ | 0.084 | 0.096 | 0.820 |



The total conflicting mass $m_{1,2,3}(A \cap B = \emptyset) = 0.82 = 1 - m_1(A) m_2(A) m_3(A) - m_1(B) m_2(B) m_3(B)$ equals the sum
of the partial conflicting masses that will be redistributed through
the PCR principle in PCR6:

$$\begin{aligned}
m_{1,2,3}(A \cap B = \emptyset) &= \underbrace{m_1(A) m_2(B) m_3(B)}_{0.024} + \underbrace{m_2(A) m_1(B) m_3(B)}_{0.144} + \underbrace{m_3(A) m_1(B) m_2(B)}_{0.224} \\
&\quad + \underbrace{m_1(B) m_2(A) m_3(A)}_{0.336} + \underbrace{m_2(B) m_1(A) m_3(A)}_{0.056} + \underbrace{m_3(B) m_1(A) m_2(A)}_{0.036} = 0.82
\end{aligned}$$

Applying the PCR principle to each of these six partial conflicts,
one gets:

• for $m_1(A) m_2(B) m_3(B) = 0.2 \cdot 0.4 \cdot 0.3 = 0.024$

$$\frac{x_1(A)}{0.2} = \frac{y_1(B)}{0.4 + 0.3} = \frac{0.024}{0.2 + 0.3 + 0.4}$$

whence $x_1(A) \approx 0.005333$ and $y_1(B) \approx 0.018667$.

• for $m_2(A) m_1(B) m_3(B) = 0.6 \cdot 0.8 \cdot 0.3 = 0.144$

$$\frac{x_2(A)}{0.6} = \frac{y_2(B)}{0.8 + 0.3} = \frac{0.144}{0.6 + 0.8 + 0.3}$$

whence $x_2(A) \approx 0.050824$ and $y_2(B) \approx 0.093176$.

• for $m_3(A) m_1(B) m_2(B) = 0.7 \cdot 0.8 \cdot 0.4 = 0.224$

$$\frac{x_3(A)}{0.7} = \frac{y_3(B)}{0.8 + 0.4} = \frac{0.224}{0.7 + 0.8 + 0.4}$$

whence $x_3(A) \approx 0.082526$ and $y_3(B) \approx 0.141474$.

• for $m_1(B) m_2(A) m_3(A) = 0.8 \cdot 0.6 \cdot 0.7 = 0.336$

$$\frac{x_4(A)}{0.6 + 0.7} = \frac{y_4(B)}{0.8} = \frac{0.336}{0.8 + 0.6 + 0.7}$$

whence $x_4(A) \approx 0.208000$ and $y_4(B) \approx 0.128000$.

• for $m_2(B) m_1(A) m_3(A) = 0.4 \cdot 0.2 \cdot 0.7 = 0.056$

$$\frac{x_5(A)}{0.2 + 0.7} = \frac{y_5(B)}{0.4} = \frac{0.056}{0.4 + 0.2 + 0.7}$$

whence $x_5(A) \approx 0.038769$ and $y_5(B) \approx 0.017231$.

• for $m_3(B) m_1(A) m_2(A) = 0.3 \cdot 0.2 \cdot 0.6 = 0.036$

$$\frac{x_6(A)}{0.2 + 0.6} = \frac{y_6(B)}{0.3} = \frac{0.036}{0.3 + 0.2 + 0.6}$$

whence $x_6(A) \approx 0.026182$ and $y_6(B) \approx 0.009818$.

Therefore, with PCR6 one finally gets

$$m^{PCR6}_{1,2,3}(A) = 0.084 + \sum_{i=1}^{6} x_i(A) = 0.495634$$

$$m^{PCR6}_{1,2,3}(B) = 0.096 + \sum_{i=1}^{6} y_i(B) = 0.504366$$

whereas the Averaging Rule (2) will give us

$$m^{Average}_{1,2,3}(A) = \frac{1}{3} \cdot (0.2 + 0.6 + 0.7) = \frac{1.5}{3} = 0.5$$

$$m^{Average}_{1,2,3}(B) = \frac{1}{3} \cdot (0.8 + 0.4 + 0.3) = \frac{1.5}{3} = 0.5$$

In this example, the intersection of focal elements is empty
but the bba’s to combine are not binary. Therefore the
conflict between sources is not total, the theorem doesn’t
apply, and so the PCR6 results differ from the Averaging Rule.

It can however happen that in some very particular symmetric
cases PCR6 coincides with the Averaging Rule. For
example, consider the bba’s given in Table VI.

Table VI. A BAYESIAN NON-BINARY SYMMETRIC EXAMPLE.

| bba’s \ Focal elem. | $A$ | $B$ | $A \cap B = \emptyset$ |
|---|---|---|---|
| $m_1(.)$ | 0.2 | 0.8 | 0 |
| $m_2(.)$ | 0.5 | 0.5 | 0 |
| $m_3(.)$ | 0.8 | 0.2 | 0 |
| $m_{1,2,3}(.)$ | 0.08 | 0.08 | 0.84 |

In this case the opinion of source #1 totally balances the opinion
of source #3, and the opinion of source #2 cannot support $A$
more than $B$ (and reciprocally), so that the fusion problem
is totally symmetrical. In this example, it is expected that the
final fusion result should commit an equal mass of belief to $A$
and to $B$. And indeed, it can be easily verified that one gets
in this case

$$m^{PCR6}_{1,2,3}(A) = m^{Average}_{1,2,3}(A) = 0.5$$

$$m^{PCR6}_{1,2,3}(B) = m^{Average}_{1,2,3}(B) = 0.5$$

which makes perfect sense. Note that the Averaging Rule
provides the same result on Example 4, which is somewhat
questionable because Example 4 doesn’t present an inherent
symmetrical structure. In our opinion PCR6 presents the advantage
of responding more adequately to the change of inherent internal
structure (asymmetry) of the bba’s to combine, which is not well
captured by the simple averaging fusion rule.

IV. Application to probability estimation

Let’s review a simple coin tossing random experiment.
When we flip a coin [13], there are two possible outcomes: the
coin could land showing a head ($H$) or a tail ($T$). The list of all
possible outcomes is called the sample space and corresponds
to the frame $\Theta = \{H, T\}$. There exist many interpretations
of probability [14] that are out of the scope of this paper. We
focus here on the estimation of the probability measure $P(H)$
of a given coin (biased or not) based on $n$ outcomes of a coin
tossing experiment. The long-run frequentist interpretation of
probability [15] considers that the probability of an event
$A$ is its relative frequency of occurrence over time after
repeating the experiment a large number of times under similar
circumstances, that is

$$P(A) = \lim_{n \to \infty} \frac{n(A)}{n} \qquad (12)$$

where $n(A)$ denotes the number of occurrences of an event
$A$ in $n > 0$ trials. In practice however, we usually estimate
the probability of an event $A$ based only on a limited number
of data (observations) that are available, and so we estimate
the idealistic $P(A)$ defined in (12) by the classical Laplace’s
probability definition

$$P(A \mid n(A), n) = \frac{n(A)}{n} \qquad (13)$$

Naturally, $P(A) \geq 0$ because $n(A) \geq 0$ and $n > 0$, and
$P(A) \leq 1$ because we cannot get $n(A) > n$ in a series of
$n$ trials. Moreover, $P(A) + P(\bar{A}) = 1$ because
$\frac{n(A)}{n} + \frac{n(\bar{A})}{n} = 1$, where $\bar{A}$ is the
complement of $A$ in the sample space.

It is interesting to note that the classical estimation of the
probability measure given by (13) corresponds in fact to the
simple averaging fusion rule of distinct pieces of evidence
represented by binary masses. For example, let’s take a coin
and flip it $n = 8$ times and assume for instance that we observe
the following series of outcomes $\{o_1 = H, o_2 = H, o_3 = T, o_4 = H, o_5 = T, o_6 = H, o_7 = H, o_8 = T\}$, so that
$n(H) = 5$ and $n(T) = 3$. Then these observations can be
associated with distinct sources of evidence providing the
following basic (binary) belief assignments:

Table VII. OUTCOMES OF A COIN TOSSING EXPERIMENT.

| bba’s \ Focal elem. | $H$ | $T$ |
|---|---|---|
| $m_1(.)$ | 1 | 0 |
| $m_2(.)$ | 1 | 0 |
| $m_3(.)$ | 0 | 1 |
| $m_4(.)$ | 1 | 0 |
| $m_5(.)$ | 0 | 1 |
| $m_6(.)$ | 1 | 0 |
| $m_7(.)$ | 1 | 0 |
| $m_8(.)$ | 0 | 1 |

It is clear that the probability estimate in (13) equals the
averaging fusion rule (2) in this example because

$$\begin{aligned}
P(H \mid \{o_1, o_2, \ldots, o_8\}) &= \frac{n(H)}{n} = \frac{5}{8} && \text{by eq. (13)} \\
&= \frac{1}{8}(1 + 1 + 0 + 1 + 0 + 1 + 1 + 0) \\
&= m^{Average}_{1,2,\ldots,8}(H) && \text{by eq. (2)}
\end{aligned}$$
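The equality above can be checked with a few lines of code. The sketch below is illustrative only: the helper names `binary_bba` and `averaging_rule` are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is an implementation choice made here to avoid rounding. Each observed outcome is turned into a binary bba on $\{H, T\}$, the bba’s are averaged, and the result is compared with the relative frequency $n(H)/n$.

```python
from fractions import Fraction

# Series of outcomes o_1, ..., o_8 of the coin tossing example.
outcomes = ["H", "H", "T", "H", "T", "H", "H", "T"]


def binary_bba(outcome, frame=("H", "T")):
    """Binary bba: mass 1 on the observed outcome, 0 on the other one."""
    return {x: Fraction(int(x == outcome)) for x in frame}


def averaging_rule(bbas):
    """Arithmetic mean of the masses committed to each element."""
    s = len(bbas)
    return {x: sum(bba[x] for bba in bbas) / s for x in bbas[0]}


bbas = [binary_bba(o) for o in outcomes]
fused = averaging_rule(bbas)

# Relative-frequency estimate n(H)/n of eq. (13).
laplace = Fraction(outcomes.count("H"), len(outcomes))

print(fused["H"], laplace)  # 5/8 5/8
```

Since all masses are binary, summing column $H$ simply counts the heads, which is why the two estimates agree for any series of outcomes and not only for this one.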